
          Python Data Analysis: Time-Handling Techniques 1, 2, 3


          2020-08-04 01:01




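          Goals of this article

          1. Learn to use a time index and slice by points in time;
          2. Learn to convert and format datetime values;
          3. Learn to compute time differences over a span of time;
          4. Learn to add and subtract time offsets to get new points in time.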


          Time-Handling Technique 1


          # Read two identical copies of the data, df1 and df2, so that one can serve as a control
          In [1]:
          import pandas as pd
          df1 = pd.read_clipboard()
          df2 = pd.read_clipboard()
          df1.head()

          Out[1]:


             TR_DT      TR_TM     CUST_ID     CUST_NAME  CARD_ID     TR_TYPE  OPP_ID      TR_AMT  OPP_NAME  COMMENT
          0  2017/4/13  3:23:26   6242668433  王丹       6242668433  D        4389320700  5023    楊利      一個
          1  2019/8/24  11:27:21  645470876   江桂珍     645470876   C        3160533721  5023    吳英      還有,決定,只是,客戶,基本
          2  2013/3/7   22:03:10  8043584726  楊璐       8043584726  C        784515667   5024    鐘楠      數據,管理,什么
          3  2013/3/1   20:02:40  4580505126  王龍       4580505126  C        7745988473  5026    許晨      網上,等級,文件,資源
          4  2019/2/9   0:50:28   7413364874  賈桂蘭     7413364874  D        7830830478  5041    楊丹丹    關于,進入,發現


          In [2]:df1.shape


          Out[2]:

          (5450, 10)

          In [3]:df1.index


          Out[3]:

          RangeIndex(start=0, stop=5450, step=1)

          In [4]:df1.columns.tolist()


          Out[4]:

          ['TR_DT',
           'TR_TM',
           'CUST_ID',
           'CUST_NAME',
           'CARD_ID',
           'TR_TYPE',
           'OPP_ID',
           'TR_AMT',
           'OPP_NAME',
           'COMMENT']

          # info() shows that neither TR_DT nor TR_TM is stored as a datetime type
          In [5]:
          df1.info()


          <class 'pandas.core.frame.DataFrame'>
          RangeIndex: 5450 entries, 0 to 5449
          Data columns (total 10 columns):
          TR_DT        5450 non-null object
          TR_TM        5450 non-null object
          CUST_ID      5450 non-null int64
          CUST_NAME    5450 non-null object
          CARD_ID      5450 non-null int64
          TR_TYPE      5450 non-null object
          OPP_ID       5450 non-null int64
          TR_AMT       5450 non-null int64
          OPP_NAME     5450 non-null object
          COMMENT      5450 non-null object
          dtypes: int64(4), object(6)
          memory usage: 425.9+ KB


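          Time Handling 1


          1. In everyday work we normally keep timestamps in a single "date time" format and almost never store the date and the time separately, so the first step is to merge the date and time columns.


          # Remove the two columns and join them into one
          In [6]:
          dt = df1.pop('TR_DT') + ' ' + df1.pop('TR_TM')
          dt.tail()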


          Out[6]:

          5445    2009/4/28 13:47:13
          5446    2008/4/26 18:16:24
          5447     2001/8/7 20:42:04
          5448     2017/5/17 6:14:07
          5449    2001/1/31 10:17:27
          dtype: object
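As a minimal sketch of the merge step, the snippet below repeats the idea on a three-row sample. The frame `sample` and its values are a hypothetical stand-in for the clipboard data, not the article's dataset.

```python
import pandas as pd

# Hypothetical three-row stand-in for the clipboard data
sample = pd.DataFrame({
    "TR_DT": ["2017/4/13", "2019/8/24", "2013/3/7"],
    "TR_TM": ["3:23:26", "11:27:21", "22:03:10"],
    "TR_AMT": [5023, 5023, 5024],
})

# pop() removes a column and returns it, so the two text columns
# can be concatenated while they are dropped from the frame
merged = sample.pop("TR_DT") + " " + sample.pop("TR_TM")

remaining_columns = sample.columns.tolist()
is_datetime = pd.api.types.is_datetime64_any_dtype(merged)

print(merged.tolist())
print(remaining_columns)   # only TR_AMT is left
print(is_datetime)         # False: the merged values are still text
```

Joining the strings only changes the layout of the text. The values do not become datetimes until they are converted explicitly.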


          2. Use the insert method to place the merged column at a chosen position.


          # Insert the new column at a given column position
          In [7]:
          df1.insert(0,'DT',dt)
          # Used later as the control
          df2.insert(10,'DT',dt)
          df1.head()


          Out[7]:


             DT                  CUST_ID     CUST_NAME  CARD_ID     TR_TYPE  OPP_ID      TR_AMT  OPP_NAME  COMMENT
          0  2017/4/13 3:23:26   6242668433  王丹       6242668433  D        4389320700  5023    楊利      一個
          1  2019/8/24 11:27:21  645470876   江桂珍     645470876   C        3160533721  5023    吳英      還有,決定,只是,客戶,基本
          2  2013/3/7 22:03:10   8043584726  楊璐       8043584726  C        784515667   5024    鐘楠      數據,管理,什么
          3  2013/3/1 20:02:40   4580505126  王龍       4580505126  C        7745988473  5026    許晨      網上,等級,文件,資源
          4  2019/2/9 0:50:28    7413364874  賈桂蘭     7413364874  D        7830830478  5041    楊丹丹    關于,進入,發現


          3. Check the dtype of the new DT column (this detail matters later).


          In [8]:df1['DT'].dtype


          Out[8]:

          dtype('O')


          4. Set the new DT column as the row index and look at 10 random rows.


          In [9]:
          df1.set_index('DT', inplace = True)
          df1.sample(10)


          Out[9]:


          CUST_IDCUST_NAMECARD_IDTR_TYPEOPP_IDTR_AMTOPP_NAMECOMMENT
          DT







          2012/1/5 0:45:463808833244季軍3808833244D12786288036197421蔣紅霞人員
          2011/12/2 19:24:324699907312陳飛4699907312C170049752473256722秦建國同時,當然,文件,自己
          2007/8/19 17:59:129478303971李秀英9478303971D855302212464680劉敏作品
          2014/7/13 15:15:414789182798李雪梅4789182798C366637505012826夏娜查看,美國,知道
          2000/10/10 0:10:233166977982謝博3166977982C6648979252200406周柳東西
          2013/5/13 6:08:157197382604劉陽7197382604D633182487942848吳淑華通過,感覺,必須,閱讀,是否
          2010/12/20 16:34:451818523684蔣旭1818523684C347921594516355353孔暢有關,一次,國際
          2010/5/24 8:05:369476622720楊杰9476622720D35444307653502909范倩包括,使用,希望,其實,支持
          2011/12/28 10:23:135685935010伍玉蘭5685935010C61646658629395238劉秀芳單位
          2014/1/11 4:31:223859582722楊紅梅3859582722C3875810173759456035司俊最大,直接,教育,標題


          A Mistaken Example


          After setting DT (date & time) as the row index, we try to select all of the 2019 data, and the following error is raised.


          In [10]:df1['2019']
          ---------------------------------------------------------------------------
          KeyError                                  Traceback (most recent call last)
          E:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
             2656             try:
          -> 2657                 return self._engine.get_loc(key)
             2658             except KeyError:

          pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

          pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

          pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

          pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

          KeyError: '2019'

          During handling of the above exception, another exception occurred:

          KeyError                                  Traceback (most recent call last)
           in
          ----> 1 df1['2019']

          E:\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
             2925             if self.columns.nlevels > 1:
             2926                 return self._getitem_multilevel(key)
          -> 2927             indexer = self.columns.get_loc(key)
             2928             if is_integer(indexer):
             2929                 indexer = [indexer]

          E:\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
             2657                 return self._engine.get_loc(key)
             2658             except KeyError:
          -> 2659                 return self._engine.get_loc(self._maybe_cast_indexer(key))
             2660         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
             2661         if indexer.ndim > 1 or indexer.size > 1:

          pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

          pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

          pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

          pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

          KeyError: '2019'

          The cause is clear: the date-formatted strings were never converted to real datetime values, so the index is just text and cannot be searched by year.
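The difference between a text index and a real datetime index can be reproduced on a tiny Series. This is a sketch under assumptions: the three timestamps and amounts are hypothetical sample values, and `.loc` is used for the lookup.

```python
import pandas as pd

stamps = ["2017-04-13 03:23:26", "2019-08-24 11:27:21", "2019-02-09 00:50:28"]
amounts = [5023, 5023, 5041]

# Text index: '2019' is compared with the labels literally and is not found
by_text = pd.Series(amounts, index=stamps)
try:
    by_text.loc["2019"]
    text_lookup_ok = True
except KeyError:
    text_lookup_ok = False

# DatetimeIndex: '2019' is read as "every timestamp that falls in 2019"
by_time = pd.Series(amounts, index=pd.to_datetime(stamps))
rows_2019 = by_time.loc["2019"]

print(text_lookup_ok)   # False
print(len(rows_2019))   # 2
```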


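          Time-Handling Technique 2


          When working with data that contains date-formatted strings, especially when the dates take part in calculations and are an important dimension of the analysis, I recommend converting those strings to real datetime values and setting them as the row index. The benefits are many.


          1. Reset the index.


          In [11]:
          df1.reset_index(inplace = True)
          df1.head()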


          Out[11]:


             DT                  CUST_ID     CUST_NAME  CARD_ID     TR_TYPE  OPP_ID      TR_AMT  OPP_NAME  COMMENT
          0  2017/4/13 3:23:26   6242668433  王丹       6242668433  D        4389320700  5023    楊利      一個
          1  2019/8/24 11:27:21  645470876   江桂珍     645470876   C        3160533721  5023    吳英      還有,決定,只是,客戶,基本
          2  2013/3/7 22:03:10   8043584726  楊璐       8043584726  C        784515667   5024    鐘楠      數據,管理,什么
          3  2013/3/1 20:02:40   4580505126  王龍       4580505126  C        7745988473  5026    許晨      網上,等級,文件,資源
          4  2019/2/9 0:50:28    7413364874  賈桂蘭     7413364874  D        7830830478  5041    楊丹丹    關于,進入,發現


          2. Convert the type.


          # Convert the type with pd.to_datetime
          In [12]:
          df1['DT'] = pd.to_datetime(df1['DT'],format='%Y/%m/%d %H:%M:%S')
          df1['DT'].dtype


          Out[12]:

          dtype('<M8[ns]')

          3. Select by year.

          # This is one of the benefits of converting to a datetime type
          In [13]:
          df1[df1['DT'].dt.year==2019]


          Out[13]:


          DTCUST_IDCUST_NAMECARD_IDTR_TYPEOPP_IDTR_AMTOPP_NAMECOMMENT
          12019-08-24 11:27:21645470876江桂珍645470876C31605337215023吳英還有,決定,只是,客戶,基本
          42019-02-09 00:50:287413364874賈桂蘭7413364874D78308304785041楊丹丹關于,進入,發(fā)現(xiàn)
          142019-06-20 04:15:404137413510劉艷4137413510C14992827045121謝玲大小
          192019-06-05 23:57:554081613672鄭斌4081613672C33293496235150廖旭經(jīng)營
          282019-05-03 12:13:423118752384申勇3118752384D15063354585296劉麗當然,大家,說明
          1002019-03-14 05:25:048475076158范濤8475076158D90843798485918曲丹學習
          1492019-08-29 15:49:22681877640王倩681877640C73502586546308葛雪梅規(guī)定,設備,還有
          1602019-03-25 19:45:559785226364閻鵬9785226364D97745625166413劉濤企業(yè)
          1822019-04-22 21:07:099359655641彭麗9359655641C75549698066589鞠建平一次,自己,作品
          1912019-05-18 23:15:567319489752孫欣7319489752C56698845546646吳玉英根據(jù),發(fā)生,發(fā)生
          2292019-02-15 08:44:462971072241帥秀珍2971072241C49568043796909陳冬梅業(yè)務,電影,一下,帖子,不過
          2572019-04-23 07:25:454592989900陳飛4592989900D65466147157163張金鳳手機
          2682019-01-16 01:20:399188311460慕玉珍9188311460C20489703837250何冬梅項目,女人,作為,行業(yè),一定
          3262019-07-11 13:13:107946691979李淑蘭7946691979C61968017657757田林幫助,女人,只有
          3662019-01-07 12:46:513350673324祝海燕3350673324D34679045388156張玉英的話,全部,廣告,點擊
          4082019-01-22 09:07:075294695875王斌5294695875D86944742008555曹紅男人,回復,一點
          4182019-08-29 13:01:15127991302韓偉127991302C12782609218634劉峰之后
          4272019-03-05 11:03:038784664466吳佳8784664466C22002508238760劉桂芳深圳
          4672019-02-09 07:13:063633288148崔林3633288148D22643360769038張桂珍朋友
          5342019-01-08 02:53:516952087281黃桂蘭6952087281C10631919159652郭旭系統(tǒng)
          5512019-07-23 19:16:212435575150趙淑華2435575150D9485272269785高亮社會,所有,日本,生產(chǎn),音樂
          5582019-01-20 05:16:196561467782施桂蘭6561467782D23954912149841楊波電影,狀態(tài),電子
          6982019-01-16 14:21:032741356324馬彬2741356324C826807731620144仲建軍計劃,密碼,我們,操作,發(fā)現(xiàn)
          7162019-06-15 19:52:469733407575唐玉蘭9733407575D994864663721598曾浩用戶
          7172019-01-09 23:53:529475116124張秀珍9475116124C898828817521613許鳳英發(fā)表
          7332019-02-25 01:25:412043697694史蘭英2043697694C292703178522532陸春梅不要,市場,一樣
          7562019-07-17 23:22:425133778235張浩5133778235D981089727525040孫潔提供
          7712019-04-17 16:19:36528291341劉海燕528291341C302097701325959黃丹丹文化
          8092019-03-07 18:48:375762950428韓洋5762950428C655727901228584盛輝你們,評論,網(wǎng)站
          8732019-06-11 18:29:46193388281宋秀英193388281C346939234133955張帥免費,或者,各種
          ..............................
          47162019-04-01 05:01:22407191975陳俊407191975C4100230801197165137滕晨推薦
          47222019-04-11 09:52:152830451569吉龍2830451569C2084696072205077794張建平歡迎
          47522019-03-24 17:00:347365738066劉偉7365738066C6160001466239114293劉斌政府
          47592019-01-28 05:53:126523292694黎秀芳6523292694C3802748920246025632王倩深圳,他們,國內(nèi)
          47672019-04-11 01:55:528065795703張秀梅8065795703D2896766255253739374袁雷以下,歷史,生產(chǎn)
          48472019-05-19 19:31:079200616229李桂花9200616229D5347078010342354824宮博您的
          48952019-06-27 17:27:203828161273徐峰3828161273D3811921325395016562盧云電話
          49442019-02-23 02:25:259460657215姚鳳蘭9460657215C7094387010442623965馬鳳英還是,朋友,喜歡,用戶,行業(yè)
          49532019-03-27 13:43:15531424448邵健531424448C1146425436451759384宋秀榮類型
          49812019-08-02 10:33:523010220054洪飛3010220054D4122269062476878314張瑩密碼
          49962019-07-08 09:13:598359281020莫瑩8359281020C8768783410498431924沈琴會員
          50032019-07-06 22:19:582028094118羅琴2028094118D5979852217506641875趙玉英部分
          50142019-03-18 22:19:11998069133徐玉珍998069133C5799478716520102884余海燕重要,成為,服務,同時,登錄
          50262019-03-19 23:16:444933056263羅寧4933056263D8200499382530328985修琳不能
          50552019-06-16 16:29:214663395054谷霞4663395054D4244880246553983223余明全部,這么,進入,時候,支持
          50642019-09-05 05:18:284874710828易超4874710828D3119882013560089617趙秀榮國家,幫助,公司
          51242019-02-17 16:35:275127266304黃雪5127266304C1491876185625620297汪建平公司,您的,進入,數(shù)據(jù)
          51362019-05-20 12:45:333737874458章瑞3737874458C5820740837639509111唐丹發(fā)布,記者,時候,上海,知道
          51522019-08-22 02:14:059880973840李明9880973840C4753187380650723952李穎數(shù)據(jù),位置,是否,一起,軟件
          51872019-05-05 01:51:285969612341嚴芳5969612341D8469711917696169555韓丹丹非常
          51942019-04-02 09:49:491691148700張秀蘭1691148700C1505305046713755945曹春梅注冊,中心,合作
          51962019-03-11 20:20:195261976352劉小紅5261976352D4116060655718706125陳穎狀態(tài),但是,會員,不能,今年
          52202019-06-08 22:31:532226838817藍云2226838817C5109722835741192242梅軍控制
          52382019-06-04 03:22:347560543392魏秀榮7560543392D8790965520769273079王偉工作,原因,國際,能力
          52422019-08-22 20:50:281739867400張建國1739867400C2915260463770790503趙桂芳詳細
          52552019-03-25 03:44:558833700305劉瑞8833700305C5048275295787623534李瑞一些,情況,標準,方式
          52802019-03-02 20:13:409245143007王佳9245143007C1827932520818620397王軍不過
          53532019-06-24 18:35:188712004123趙玉8712004123C2844248173894776758姜帆你們,感覺,業(yè)務,什么
          53832019-03-30 04:21:586736717536華秀蘭6736717536C1040628699925870996陳成下載
          54142019-03-27 21:25:502777228832楊秀梅2777228832D1814721576962375431高強回復,顯示,這里

          193 rows × 9 columns


          # Look at the control group
          In [14]:
          df2.head()


          Out[14]:


             TR_DT      TR_TM     CUST_ID     CUST_NAME  CARD_ID     TR_TYPE  OPP_ID      TR_AMT  OPP_NAME  COMMENT                   DT
          0  2017/4/13  3:23:26   6242668433  王丹       6242668433  D        4389320700  5023    楊利      一個                      2017/4/13 3:23:26
          1  2019/8/24  11:27:21  645470876   江桂珍     645470876   C        3160533721  5023    吳英      還有,決定,只是,客戶,基本  2019/8/24 11:27:21
          2  2013/3/7   22:03:10  8043584726  楊璐       8043584726  C        784515667   5024    鐘楠      數據,管理,什么            2013/3/7 22:03:10
          3  2013/3/1   20:02:40  4580505126  王龍       4580505126  C        7745988473  5026    許晨      網上,等級,文件,資源       2013/3/1 20:02:40
          4  2019/2/9   0:50:28   7413364874  賈桂蘭     7413364874  D        7830830478  5041    楊丹丹    關于,進入,發現            2019/2/9 0:50:28


          In [15]:df2.shape


          Out[15]:

          (5450, 11)


          In [16]:
          %%timeit
          pd.to_datetime(df2['TR_DT'])
          4.78 ms ± 398 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)


          In [17]:
          %%timeit
          df2['TR_DT'] = pd.to_datetime(df2['TR_DT'],format = '%Y/%m/%d')
          618 μs ± 16 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
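The sketch below checks that stating the format explicitly gives the same values as letting pandas work out the layout, and shows `errors="coerce"` for values that do not match. The sample strings are hypothetical and zero-padded. One caveat about the second timing above: because that cell assigns the result back to `df2['TR_DT']`, loops after the first are probably converting a column that is already datetime, so the two figures may not be a like-for-like comparison.

```python
import pandas as pd

# Hypothetical sample values in the article's year/month/day layout
raw = pd.Series(["2017/04/13 03:23:26", "2019/08/24 11:27:21", "2013/03/07 22:03:10"])
fmt = "%Y/%m/%d %H:%M:%S"

inferred = pd.to_datetime(raw)              # pandas works out the layout itself
explicit = pd.to_datetime(raw, format=fmt)  # layout stated up front
same_result = inferred.equals(explicit)

# errors="coerce" turns values that do not match the format into NaT
dirty = pd.Series(["2017/04/13 03:23:26", "not a date"])
coerced = pd.to_datetime(dirty, format=fmt, errors="coerce")

print(same_result)
print(coerced.isna().tolist())
```

For a fairer timing, one option is to time the conversion of an untouched copy of the text column without assigning the result back.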


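          4. Count the unique values. (Some people say that calculations go wrong when a time column with duplicate values is used as the row index.)


          In [18]:
          df2['TR_DT'].nunique()
          # The result is 3841, far fewer than the 5450 rows, so there clearly are duplicate values.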


          Out[18]:

          3841
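A small sketch of what duplicate dates in the index mean in practice, using three hypothetical rows. Selection by partial date string still works with duplicates. Operations that need unique labels are where duplicates can get in the way, and grouping on the index is one way to collapse them first.

```python
import pandas as pd

# Hypothetical sample: two rows share the same date
days = pd.to_datetime(["2019-08-24", "2019-08-24", "2018-08-09"])
amounts = pd.Series([10, 20, 30], index=days)

has_duplicates = not amounts.index.is_unique

# Partial-string selection returns every row in August 2019
august_2019 = amounts.loc["2019-08"]

# Collapse duplicates when one row per day is needed
per_day = amounts.groupby(level=0).sum()

print(has_duplicates)
print(august_2019.tolist())
print(per_day)
```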


          5. Set the row index. (The column is now a datetime type, in contrast to df1 above.)


          In [19]:
          df2.set_index('TR_DT',inplace =True)
          df2.tail()


          Out[19]:


                      TR_TM     CUST_ID     CUST_NAME  CARD_ID     TR_TYPE  OPP_ID      TR_AMT     OPP_NAME  COMMENT                   DT
          TR_DT
          2009-04-28  13:47:13  5256787952  徐桂蘭     5256787952  D        6107428542  993450936  姬春梅    如果                      2009/4/28 13:47:13
          2008-04-26  18:16:24  6028210306  馬霞       6028210306  C        1533609481  995224040  李利      其中                      2008/4/26 18:16:24
          2001-08-07  20:42:04  9012812841  李麗麗     9012812841  D        4561419087  995968641  王穎      城市                      2001/8/7 20:42:04
          2017-05-17  6:14:07   9929073172  馮麗娟     9929073172  D        6425254614  996109384  陳金鳳    那么,沒有,開發,今天,深圳  2017/5/17 6:14:07
          2001-01-31  10:17:27  7938005479  楊慧       7938005479  C        3678877832  999526658  姚梅      支持,看到,國內,專業       2001/1/31 10:17:27


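          # Because the index is now a datetime type, all of the 2019 data can be selected directly
          # (the pandas version used here also accepted df2['2019']; pandas 2.0 and later need .loc)
          In [20]:
          df2.loc['2019'].head()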


          Out[20]:


                      TR_TM     CUST_ID     CUST_NAME  CARD_ID     TR_TYPE  OPP_ID      TR_AMT  OPP_NAME  COMMENT                   DT
          TR_DT
          2019-08-24  11:27:21  645470876   江桂珍     645470876   C        3160533721  5023    吳英      還有,決定,只是,客戶,基本  2019/8/24 11:27:21
          2019-02-09  0:50:28   7413364874  賈桂蘭     7413364874  D        7830830478  5041    楊丹丹    關于,進入,發現            2019/2/9 0:50:28
          2019-06-20  4:15:40   4137413510  劉艷       4137413510  C        1499282704  5121    謝玲      大小                      2019/6/20 4:15:40
          2019-06-05  23:57:55  4081613672  鄭斌       4081613672  C        3329349623  5150    廖旭      經營                      2019/6/5 23:57:55
          2019-05-03  12:13:42  3118752384  申勇       3118752384  D        1506335458  5296    劉麗      當然,大家,說明            2019/5/3 12:13:42


          # View all of the data for August 2018 (on pandas 2.0 and later, use .loc)
          In [21]:
          df2.loc['2018-08']


          Out[21]:


          TR_TMCUST_IDCUST_NAMECARD_IDTR_TYPEOPP_IDTR_AMTOPP_NAMECOMMENTDT
          TR_DT









          2018-08-091:18:594461213772張想4461213772D389243939652554趙建國不斷2018/8/9 1:18:59
          2018-08-021:18:157608142050余欣7608142050D154316267559720李晨實現(xiàn),今天,無法2018/8/2 1:18:15
          2018-08-020:59:119466501052哈秀芳9466501052C896010727080014董桂芝之后,查看,這種,中文,行業(yè)2018/8/2 0:59:11
          2018-08-1017:53:555892229304吳玲5892229304C473641905594136張紅霞查看2018/8/10 17:53:55
          2018-08-0622:52:193265289234張萍3265289234D495817780513761宋淑英過程,產(chǎn)品,就是,時候2018/8/6 22:52:19
          2018-08-2621:29:056881709860吳淑英6881709860C1955037839614626許彬來源2018/8/26 21:29:05
          2018-08-2622:22:367596300602姜陽7596300602D5586089463741498趙楠來源2018/8/26 22:22:36
          2018-08-2812:55:447354511095劉淑英7354511095D7543558124937545張暢現(xiàn)在,可能,部門2018/8/28 12:55:44
          2018-08-1910:03:367608567890黃彬7608567890C65712236536964670王靜類別,合作,到了,文化,計劃2018/8/19 10:03:36
          2018-08-2421:13:362862928104何瑞2862928104C94758685199728345陳玉梅發(fā)現(xiàn),參加,質(zhì)量2018/8/24 21:13:36
          2018-08-1116:07:186373291096劉寧6373291096C481308366836566134石娜價格2018/8/11 16:07:18
          2018-08-133:47:447375646786楊淑珍7375646786C504570249184429974金雪梅能夠,投資,其他,詳細2018/8/13 3:47:44
          2018-08-315:23:41863602136黃峰863602136D246635986493340210陳紅問題,自己,自己,可以,建設2018/8/31 5:23:41
          2018-08-193:15:356795273748姜麗華6795273748D4530696397100190803朱想文化2018/8/19 3:15:35
          2018-08-3022:17:052416300510曹倩2416300510D5019740318296681657裴琴個人2018/8/30 22:17:05
          2018-08-126:36:419957924298劉文9957924298C1242463232619712915劉雪網(wǎng)上,的是,更新,論壇,加入2018/8/12 6:36:41
          2018-08-070:52:431797969934劉玉梅1797969934D5461654240721062764黃洋應該,今天,還有2018/8/7 0:52:43
          2018-08-038:07:424163264148張歡4163264148D7002757167991012528張淑英而且2018/8/3 8:07:42


          # Another benefit: selecting the data within a time range (sort the index first)
          In [22]:
          df2.sort_index(inplace  = True)
          In [23]:
          df2['2000-01-01':'2000-01-06']


          Out[23]:


          TR_TMCUST_IDCUST_NAMECARD_IDTR_TYPEOPP_IDTR_AMTOPP_NAMECOMMENTDT
          TR_DT









          2000-01-0113:37:418379638746陳梅8379638746C567568598057767819邢霞價格,社區(qū),控制,完成,品牌2000/1/1 13:37:41
          2000-01-052:41:28530197567易寧530197567C76204664436193李金鳳質(zhì)量2000/1/5 2:41:28
          2000-01-0520:11:053113560650梁秀云3113560650C96408830662212113王波網(wǎng)上2000/1/5 20:11:05
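A minimal sketch of the same range selection on four hypothetical rows. It sorts the index first and then slices with `.loc`. With date strings, both endpoints are inclusive, and the end date covers that whole day.

```python
import pandas as pd

# Hypothetical unsorted sample
stamps = pd.to_datetime([
    "2000-01-05 20:11:05",
    "2000-01-01 13:37:41",
    "2000-01-08 12:24:21",
    "2000-01-05 02:41:28",
])
frame = pd.DataFrame({"TR_AMT": [1, 2, 3, 4]}, index=stamps)

# Range slicing is only reliable on a sorted DatetimeIndex
frame = frame.sort_index()

# Everything from the start of Jan 1 through the end of Jan 6
window = frame.loc["2000-01-01":"2000-01-06"]

print(window)
```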


          6. Truncate the data to a given time range.


          # Keep the data from September 2019 onward
          In [24]:
          after_09 = df2.truncate(before='2019-09')
          after_09


          Out[24]:


          TR_TMCUST_IDCUST_NAMECARD_IDTR_TYPEOPP_IDTR_AMTOPP_NAMECOMMENTDT
          TR_DT









          2019-09-0113:13:306612959791龐婷婷6612959791D646299275193402寧霞不過,資料,還有,來自2019/9/1 13:13:30
          2019-09-0114:30:186082904803阮浩6082904803C46643869471274246王倩公司,地方,世界2019/9/1 14:30:18
          2019-09-0420:43:035297235591鐘陽5297235591D55859610333653940陸佳參加,出現(xiàn),關系2019/9/4 20:43:03
          2019-09-055:18:284874710828易超4874710828D3119882013560089617趙秀榮國家,幫助,公司2019/9/5 5:18:28
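One detail worth knowing when both ends are given: `truncate` treats a bare date as midnight of that day, whereas string slicing with `.loc` includes the whole end day. The sketch below shows the difference on four hypothetical timestamps.

```python
import pandas as pd

# Hypothetical sorted sample
stamps = pd.to_datetime([
    "2019-08-31 01:22:46",
    "2019-09-01 13:13:30",
    "2019-09-04 20:43:03",
    "2019-09-05 05:18:28",
])
s = pd.Series([1, 2, 3, 4], index=stamps)

# Keep everything from 1 September onward
from_september = s.truncate(before="2019-09-01")

# 'after' is taken as 2019-09-04 00:00:00, so the 20:43:03 row is cut off
truncated = s.truncate(before="2019-09-01", after="2019-09-04")

# String slicing keeps rows up to the end of 2019-09-04
sliced = s.loc["2019-09-01":"2019-09-04"]

print(from_september.tolist())
print(truncated.tolist())
print(sliced.tolist())
```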


          # Control: with a plain integer index, truncate does not cut by time range, but it can cut by index label
          In [25]:
          df1['DT'].truncate(after = 9)


          Out[25]:

          0   2017-04-13 03:23:26
          1   2019-08-24 11:27:21
          2   2013-03-07 22:03:10
          3   2013-03-01 20:02:40
          4   2019-02-09 00:50:28
          5   2000-09-22 01:57:51
          6   2013-12-30 20:14:52
          7   2010-02-14 14:13:39
          8   2005-09-11 21:58:32
          9   2014-11-28 02:23:22
          Name: DT, dtype: datetime64[ns]

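          7. Compute time intervals.

          # Both operands must be datetime types; the raw result is hard to read
          # (pd.datetime was removed in pandas 2.0, so pd.Timestamp.now() is used here)
          In [26]:
          delta = pd.Timestamp.now() - df1['DT'].truncate(after = 9)
          delta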


          Out[26]:

          0   1034 days 11:46:11.011366
          1    171 days 03:42:16.011366
          2   2531 days 17:06:27.011366
          3   2537 days 19:06:57.011366
          4    367 days 14:19:09.011366
          5   7081 days 13:11:46.011366
          6   2233 days 18:54:45.011366
          7   3649 days 00:55:58.011366
          8   5265 days 17:11:05.011366
          9   1901 days 12:46:15.011366
          Name: DT, dtype: timedelta64[ns]


          # The raw interval is hard to read, so break it into whole days and leftover seconds
          In [27]:
          delta.dt.days, delta.dt.seconds


          Out[27]:

          (0    1034
           1     171
           2    2531
           3    2537
           4     367
           5    7081
           6    2233
           7    3649
           8    5265
           9    1901
           Name: DT, dtype: int64, 0    42371
           1    13336
           2    61587
           3    68817
           4    51549
           5    47506
           6    68085
           7     3358
           8    61865
           9    45975
           Name: DT, dtype: int64)
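Note that `.dt.seconds` is only the seconds left over after the whole days, not the length of the interval. The sketch below uses a fixed reference time (a hypothetical stand-in for "now", so the result is repeatable) and also shows `total_seconds()` and division by a `Timedelta` for the full interval.

```python
import pandas as pd

start = pd.Series(pd.to_datetime(["2019-02-09 00:50:28", "2017-04-13 03:23:26"]))
reference = pd.Timestamp("2020-02-11 15:09:37")   # hypothetical fixed "now"

delta = reference - start

whole_days = delta.dt.days                        # full days only
leftover_seconds = delta.dt.seconds               # seconds within the last partial day
total_seconds = delta.dt.total_seconds()          # the entire interval in seconds
fractional_days = delta / pd.Timedelta(days=1)    # the entire interval in days

print(whole_days.tolist())
print(leftover_seconds.tolist())
print(total_seconds.tolist())
```

The relationship between the three is `total_seconds = days * 86400 + seconds`, which is a quick way to sanity-check the pieces.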


          8. Add a time increment.


          # Add one week
          In [28]:
          df1['DT'].head() + pd.Timedelta(days = 7)


          Out[28]:

          0   2017-04-20 03:23:26
          1   2019-08-31 11:27:21
          2   2013-03-14 22:03:10
          3   2013-03-08 20:02:40
          4   2019-02-16 00:50:28
          Name: DT, dtype: datetime64[ns]
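`pd.Timedelta` covers fixed-length steps such as days or hours. For calendar steps such as "one month later", where the length varies, `pd.DateOffset` is one option. A short sketch with two hypothetical timestamps:

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2019-01-31 10:17:27", "2017-04-13 03:23:26"]))

# Fixed-length step: exactly 7 * 24 hours later
plus_week = s + pd.Timedelta(days=7)

# Calendar step: same day next month, clipped to the month end when needed
plus_month = s + pd.DateOffset(months=1)

print(plus_week.tolist())
print(plus_month.tolist())
```

Here 31 January plus one month lands on 28 February, because February 2019 has no 31st.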


          9. Extract the year.


          In [29]:df1['DT'].head().dt.year


          Out[29]:

          0    2017
          1    2019
          2    2013
          3    2013
          4    2019
          Name: DT, dtype: int64

          10. Extract the month.


          In [30]:df1['DT'].head().dt.month


          Out[30]:

          0    4
          1    8
          2    3
          3    3
          4    2
          Name: DT, dtype: int64


          11. Extract the day of the month.


          In [31]:df1['DT'].head().dt.day


          Out[31]:

          0    13
          1    24
          2     7
          3     1
          4     9
          Name: DT, dtype: int64
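The extracted parts are most useful as grouping keys. As a sketch, the snippet below adds hypothetical `year` and `month` columns to a five-row sample (values taken from the first rows shown earlier) and totals the amounts per year.

```python
import pandas as pd

sample = pd.DataFrame({
    "DT": pd.to_datetime([
        "2017-04-13 03:23:26",
        "2019-08-24 11:27:21",
        "2013-03-07 22:03:10",
        "2013-03-01 20:02:40",
        "2019-02-09 00:50:28",
    ]),
    "TR_AMT": [5023, 5023, 5024, 5026, 5041],
})

# Derive calendar parts with the .dt accessor
sample["year"] = sample["DT"].dt.year
sample["month"] = sample["DT"].dt.month

# Total amount per year, and number of rows per (year, month)
by_year = sample.groupby("year")["TR_AMT"].sum()
rows_per_month = sample.groupby(["year", "month"])["TR_AMT"].count()

print(by_year)
print(rows_per_month)
```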


          Time-Handling Technique 3


          Sometimes we need the opposite: splitting a combined date and time apart again. How is that done?

          # Check the dtype
          In [32]:
          df2['DT'].dtype


          Out[32]:

          dtype('O')


          Method 1: use a custom date/time format


          1. Convert the type.


          In [33]:pd.to_datetime(df2['DT'],format='%Y/%m/%d %H:%M:%S')


          Out[33]:

          TR_DT
          2000-01-01???2000-01-01?13:37:41
          2000-01-05???2000-01-05?02:41:28
          2000-01-05???2000-01-05?20:11:05
          2000-01-08???2000-01-08?12:24:21
          2000-01-10???2000-01-10?13:47:24
          2000-01-11???2000-01-11?09:57:18
          2000-01-13???2000-01-13?05:14:15
          2000-01-13???2000-01-13?15:11:01
          2000-01-14???2000-01-14?03:14:46
          2000-01-15???2000-01-15?01:33:14
          2000-01-20???2000-01-20?17:23:17
          2000-01-20???2000-01-20?16:03:59
          2000-01-21???2000-01-21?19:21:49
          2000-01-22???2000-01-22?09:37:04
          2000-01-22???2000-01-22?02:32:30
          2000-01-23???2000-01-23?01:04:06
          2000-01-23???2000-01-23?21:27:52
          2000-01-25???2000-01-25?09:45:24
          2000-01-31???2000-01-31?10:07:12
          2000-01-31???2000-01-31?18:38:42
          2000-01-31???2000-01-31?12:15:48
          2000-02-01???2000-02-01?00:19:37
          2000-02-02???2000-02-02?11:06:32
          2000-02-03???2000-02-03?20:01:44
          2000-02-04???2000-02-04?01:14:56
          2000-02-04???2000-02-04?15:09:53
          2000-02-07???2000-02-07?04:01:30
          2000-02-08???2000-02-08?15:10:03
          2000-02-09???2000-02-09?18:34:20
          2000-02-09???2000-02-09?12:45:29
          ?????????????????????...????????
          2019-08-01???2019-08-01?02:01:35
          2019-08-01???2019-08-01?06:00:56
          2019-08-02???2019-08-02?10:33:52
          2019-08-03???2019-08-03?21:39:52
          2019-08-04???2019-08-04?14:57:07
          2019-08-04???2019-08-04?14:49:10
          2019-08-04???2019-08-04?14:33:55
          2019-08-07???2019-08-07?12:59:10
          2019-08-09???2019-08-09?02:49:38
          2019-08-10???2019-08-10?02:24:42
          2019-08-12???2019-08-12?15:29:54
          2019-08-13???2019-08-13?07:50:24
          2019-08-14???2019-08-14?13:37:55
          2019-08-16???2019-08-16?07:31:32
          2019-08-20???2019-08-20?18:15:54
          2019-08-22???2019-08-22?20:50:28
          2019-08-22???2019-08-22?02:14:05
          2019-08-23???2019-08-23?22:38:41
          2019-08-24???2019-08-24?11:27:21
          2019-08-24???2019-08-24?15:08:09
          2019-08-26???2019-08-26?10:10:34
          2019-08-28???2019-08-28?05:55:17
          2019-08-29???2019-08-29?13:01:15
          2019-08-29???2019-08-29?15:49:22
          2019-08-29???2019-08-29?10:07:08
          2019-08-31???2019-08-31?01:22:46
          2019-09-01???2019-09-01?13:13:30
          2019-09-01???2019-09-01?14:30:18
          2019-09-04???2019-09-04?20:43:03
          2019-09-05???2019-09-05?05:18:28
          Name:?DT,?Length:?5450,?dtype:?datetime64[ns]


          2. Apply a custom format.


          # Custom format: %F is shorthand for %Y-%m-%d, i.e. yyyy-mm-dd
          In [34]:
          pd.to_datetime(df2['DT'],format='%Y/%m/%d %H:%M:%S').dt.strftime('%F')


          Out[34]:

          TR_DT
          2000-01-01????2000-01-01
          2000-01-05????2000-01-05
          2000-01-05????2000-01-05
          2000-01-08????2000-01-08
          2000-01-10????2000-01-10
          2000-01-11????2000-01-11
          2000-01-13????2000-01-13
          2000-01-13????2000-01-13
          2000-01-14????2000-01-14
          2000-01-15????2000-01-15
          2000-01-20????2000-01-20
          2000-01-20????2000-01-20
          2000-01-21????2000-01-21
          2000-01-22????2000-01-22
          2000-01-22????2000-01-22
          2000-01-23????2000-01-23
          2000-01-23????2000-01-23
          2000-01-25????2000-01-25
          2000-01-31????2000-01-31
          2000-01-31????2000-01-31
          2000-01-31????2000-01-31
          2000-02-01????2000-02-01
          2000-02-02????2000-02-02
          2000-02-03????2000-02-03
          2000-02-04????2000-02-04
          2000-02-04????2000-02-04
          2000-02-07????2000-02-07
          2000-02-08????2000-02-08
          2000-02-09????2000-02-09
          2000-02-09????2000-02-09
          ?????????????????...????
          2019-08-01????2019-08-01
          2019-08-01????2019-08-01
          2019-08-02????2019-08-02
          2019-08-03????2019-08-03
          2019-08-04????2019-08-04
          2019-08-04????2019-08-04
          2019-08-04????2019-08-04
          2019-08-07????2019-08-07
          2019-08-09????2019-08-09
          2019-08-10????2019-08-10
          2019-08-12????2019-08-12
          2019-08-13????2019-08-13
          2019-08-14????2019-08-14
          2019-08-16????2019-08-16
          2019-08-20????2019-08-20
          2019-08-22????2019-08-22
          2019-08-22????2019-08-22
          2019-08-23????2019-08-23
          2019-08-24????2019-08-24
          2019-08-24????2019-08-24
          2019-08-26????2019-08-26
          2019-08-28????2019-08-28
          2019-08-29????2019-08-29
          2019-08-29????2019-08-29
          2019-08-29????2019-08-29
          2019-08-31????2019-08-31
          2019-09-01????2019-09-01
          2019-09-01????2019-09-01
          2019-09-04????2019-09-04
          2019-09-05????2019-09-05
          Name:?DT,?Length:?5450,?dtype:?object
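The same approach yields the time part as well. The sketch below spells the format out as `%Y-%m-%d` instead of `%F`, on the assumption that the explicit form is the more portable of the two, and uses two hypothetical timestamps.

```python
import pandas as pd

dt = pd.Series(pd.to_datetime(["2000-01-05 02:41:28", "2019-09-05 05:18:28"]))

date_text = dt.dt.strftime("%Y-%m-%d")   # date part as zero-padded text
time_text = dt.dt.strftime("%H:%M:%S")   # time part as zero-padded text

# normalize() keeps the datetime dtype and resets the time to midnight,
# which is handy when the date is needed for further calculations
date_only = dt.dt.normalize()

print(date_text.tolist())
print(time_text.tolist())
print(date_only.tolist())
```

`strftime` returns text, so the result can no longer be used for date arithmetic; `normalize()` is the alternative when the values should stay datetimes.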


          Method 2: split the date and time strings


          1. Plain split of date and time.


          In [35]:df2['DT'].str.split(' ')


          Out[35]:

          TR_DT
          2000-01-01?????[2000/1/1,?13:37:41]
          2000-01-05??????[2000/1/5,?2:41:28]
          2000-01-05?????[2000/1/5,?20:11:05]
          2000-01-08?????[2000/1/8,?12:24:21]
          2000-01-10????[2000/1/10,?13:47:24]
          2000-01-11?????[2000/1/11,?9:57:18]
          2000-01-13?????[2000/1/13,?5:14:15]
          2000-01-13????[2000/1/13,?15:11:01]
          2000-01-14?????[2000/1/14,?3:14:46]
          2000-01-15?????[2000/1/15,?1:33:14]
          2000-01-20????[2000/1/20,?17:23:17]
          2000-01-20????[2000/1/20,?16:03:59]
          2000-01-21????[2000/1/21,?19:21:49]
          2000-01-22?????[2000/1/22,?9:37:04]
          2000-01-22?????[2000/1/22,?2:32:30]
          2000-01-23?????[2000/1/23,?1:04:06]
          2000-01-23????[2000/1/23,?21:27:52]
          2000-01-25?????[2000/1/25,?9:45:24]
          2000-01-31????[2000/1/31,?10:07:12]
          2000-01-31????[2000/1/31,?18:38:42]
          2000-01-31????[2000/1/31,?12:15:48]
          2000-02-01??????[2000/2/1,?0:19:37]
          2000-02-02?????[2000/2/2,?11:06:32]
          2000-02-03?????[2000/2/3,?20:01:44]
          2000-02-04??????[2000/2/4,?1:14:56]
          2000-02-04?????[2000/2/4,?15:09:53]
          2000-02-07??????[2000/2/7,?4:01:30]
          2000-02-08?????[2000/2/8,?15:10:03]
          2000-02-09?????[2000/2/9,?18:34:20]
          2000-02-09?????[2000/2/9,?12:45:29]
          ??????????????????????...??????????
          2019-08-01??????[2019/8/1,?2:01:35]
          2019-08-01??????[2019/8/1,?6:00:56]
          2019-08-02?????[2019/8/2,?10:33:52]
          2019-08-03?????[2019/8/3,?21:39:52]
          2019-08-04?????[2019/8/4,?14:57:07]
          2019-08-04?????[2019/8/4,?14:49:10]
          2019-08-04?????[2019/8/4,?14:33:55]
          2019-08-07?????[2019/8/7,?12:59:10]
          2019-08-09??????[2019/8/9,?2:49:38]
          2019-08-10?????[2019/8/10,?2:24:42]
          2019-08-12????[2019/8/12,?15:29:54]
          2019-08-13?????[2019/8/13,?7:50:24]
          2019-08-14????[2019/8/14,?13:37:55]
          2019-08-16?????[2019/8/16,?7:31:32]
          2019-08-20????[2019/8/20,?18:15:54]
          2019-08-22????[2019/8/22,?20:50:28]
          2019-08-22?????[2019/8/22,?2:14:05]
          2019-08-23????[2019/8/23,?22:38:41]
          2019-08-24????[2019/8/24,?11:27:21]
          2019-08-24????[2019/8/24,?15:08:09]
          2019-08-26????[2019/8/26,?10:10:34]
          2019-08-28?????[2019/8/28,?5:55:17]
          2019-08-29????[2019/8/29,?13:01:15]
          2019-08-29????[2019/8/29,?15:49:22]
          2019-08-29????[2019/8/29,?10:07:08]
          2019-08-31?????[2019/8/31,?1:22:46]
          2019-09-01?????[2019/9/1,?13:13:30]
          2019-09-01?????[2019/9/1,?14:30:18]
          2019-09-04?????[2019/9/4,?20:43:03]
          2019-09-05??????[2019/9/5,?5:18:28]
          Name:?DT,?Length:?5450,?dtype:?object


          2. Extract the date part.


          In [36]:df2['DT'].str.split(' ',expand = True)[0]


          Out[36]:

          TR_DT
          2000-01-01?????2000/1/1
          2000-01-05?????2000/1/5
          2000-01-05?????2000/1/5
          2000-01-08?????2000/1/8
          2000-01-10????2000/1/10
          2000-01-11????2000/1/11
          2000-01-13????2000/1/13
          2000-01-13????2000/1/13
          2000-01-14????2000/1/14
          2000-01-15????2000/1/15
          2000-01-20????2000/1/20
          2000-01-20????2000/1/20
          2000-01-21????2000/1/21
          2000-01-22????2000/1/22
          2000-01-22????2000/1/22
          2000-01-23????2000/1/23
          2000-01-23????2000/1/23
          2000-01-25????2000/1/25
          2000-01-31????2000/1/31
          2000-01-31????2000/1/31
          2000-01-31????2000/1/31
          2000-02-01?????2000/2/1
          2000-02-02?????2000/2/2
          2000-02-03?????2000/2/3
          2000-02-04?????2000/2/4
          2000-02-04?????2000/2/4
          2000-02-07?????2000/2/7
          2000-02-08?????2000/2/8
          2000-02-09?????2000/2/9
          2000-02-09?????2000/2/9
          ????????????????...????
          2019-08-01?????2019/8/1
          2019-08-01?????2019/8/1
          2019-08-02?????2019/8/2
          2019-08-03?????2019/8/3
          2019-08-04?????2019/8/4
          2019-08-04?????2019/8/4
          2019-08-04?????2019/8/4
          2019-08-07?????2019/8/7
          2019-08-09?????2019/8/9
          2019-08-10????2019/8/10
          2019-08-12????2019/8/12
          2019-08-13????2019/8/13
          2019-08-14????2019/8/14
          2019-08-16????2019/8/16
          2019-08-20????2019/8/20
          2019-08-22????2019/8/22
          2019-08-22????2019/8/22
          2019-08-23????2019/8/23
          2019-08-24????2019/8/24
          2019-08-24????2019/8/24
          2019-08-26????2019/8/26
          2019-08-28????2019/8/28
          2019-08-29????2019/8/29
          2019-08-29????2019/8/29
          2019-08-29????2019/8/29
          2019-08-31????2019/8/31
          2019-09-01?????2019/9/1
          2019-09-01?????2019/9/1
          2019-09-04?????2019/9/4
          2019-09-05?????2019/9/5
          Name:?0,?Length:?5450,?dtype:?object


          3. Apply the custom format.


          In [37]:df2['DT'].str.split(' ',expand = True)[0].str.replace('/','-')


          Out[37]:

          TR_DT
          2000-01-01?????2000-1-1
          2000-01-05?????2000-1-5
          2000-01-05?????2000-1-5
          2000-01-08?????2000-1-8
          2000-01-10????2000-1-10
          2000-01-11????2000-1-11
          2000-01-13????2000-1-13
          2000-01-13????2000-1-13
          2000-01-14????2000-1-14
          2000-01-15????2000-1-15
          2000-01-20????2000-1-20
          2000-01-20????2000-1-20
          2000-01-21????2000-1-21
          2000-01-22????2000-1-22
          2000-01-22????2000-1-22
          2000-01-23????2000-1-23
          2000-01-23????2000-1-23
          2000-01-25????2000-1-25
          2000-01-31????2000-1-31
          2000-01-31????2000-1-31
          2000-01-31????2000-1-31
          2000-02-01?????2000-2-1
          2000-02-02?????2000-2-2
          2000-02-03?????2000-2-3
          2000-02-04?????2000-2-4
          2000-02-04?????2000-2-4
          2000-02-07?????2000-2-7
          2000-02-08?????2000-2-8
          2000-02-09?????2000-2-9
          2000-02-09?????2000-2-9
          ????????????????...????
          2019-08-01?????2019-8-1
          2019-08-01?????2019-8-1
          2019-08-02?????2019-8-2
          2019-08-03?????2019-8-3
          2019-08-04?????2019-8-4
          2019-08-04?????2019-8-4
          2019-08-04?????2019-8-4
          2019-08-07?????2019-8-7
          2019-08-09?????2019-8-9
          2019-08-10????2019-8-10
          2019-08-12????2019-8-12
          2019-08-13????2019-8-13
          2019-08-14????2019-8-14
          2019-08-16????2019-8-16
          2019-08-20????2019-8-20
          2019-08-22????2019-8-22
          2019-08-22????2019-8-22
          2019-08-23????2019-8-23
          2019-08-24????2019-8-24
          2019-08-24????2019-8-24
          2019-08-26????2019-8-26
          2019-08-28????2019-8-28
          2019-08-29????2019-8-29
          2019-08-29????2019-8-29
          2019-08-29????2019-8-29
          2019-08-31????2019-8-31
          2019-09-01?????2019-9-1
          2019-09-01?????2019-9-1
          2019-09-04?????2019-9-4
          2019-09-05?????2019-9-5
          Name:?0,?Length:?5450,?dtype:?object
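Unlike Method 1, the string route leaves months and days unpadded (`2000-1-1` rather than `2000-01-01`). If zero-padded text is wanted without converting to datetime, one option is to pad the pieces by hand. A sketch on two hypothetical values:

```python
import pandas as pd

raw = pd.Series(["2000/1/5 2:41:28", "2019/9/5 5:18:28"])

# Split into a date column (0) and a time column (1)
parts = raw.str.split(" ", expand=True)

# Plain replacement keeps the original, unpadded digits
unpadded = parts[0].str.replace("/", "-", regex=False)

# Split the date again and pad month and day to two digits
pieces = parts[0].str.split("/", expand=True)
padded = pieces[0] + "-" + pieces[1].str.zfill(2) + "-" + pieces[2].str.zfill(2)

print(unpadded.tolist())
print(padded.tolist())
print(parts[1].tolist())
```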


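          Closing Remarks


          Frankly, dates and times are one of the harder parts of data processing, and handling them badly causes all sorts of problems, especially the time differences and time increments covered in this article. So far I have not found a Python library that makes time handling simpler than SQL does. If you know of one, please let me know.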


