Wednesday, October 23, 2019

How to Generate a Datetime Series in Python




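When writing a web crawler, you often need a sequence of dates so that you can fetch the data published on each day.

Knowing how to generate such a date sequence is therefore essential.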


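1. datetime:

import datetime

# Inclusive start and end of the range, plus a one-day step
begin = datetime.datetime(2019, 1, 1)
end = datetime.datetime(2019, 1, 3)
step = datetime.timedelta(days=1)

DatePeriod = []

# Walk from begin to end, storing each day as a 'YYYYMMDD' string
while begin <= end:
    DatePeriod.append(begin.strftime('%Y%m%d'))
    begin += step

print(DatePeriod)
print(type(DatePeriod))  # a plain Python list

for Day in DatePeriod:
    print(Day)
print(type(Day))  # each element is a str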
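
The loop above advances begin in place, so the original start date is gone once the loop finishes. As a minimal sketch, the same idea can be wrapped in a generator that leaves its inputs untouched. The function name iter_days and its fmt parameter are hypothetical names chosen for illustration, not part of the original code.

```python
import datetime


def iter_days(start, end, fmt="%Y%m%d"):
    """Yield each day from start to end (inclusive) as a formatted string.

    iter_days and fmt are illustrative names, not from the original post.
    """
    current = start
    one_day = datetime.timedelta(days=1)
    while current <= end:
        yield current.strftime(fmt)
        current += one_day  # rebinds the local name; start is unchanged


days = list(iter_days(datetime.date(2019, 1, 1), datetime.date(2019, 1, 3)))
print(days)
```

Because the dates are produced lazily, a crawler can loop over iter_days(...) directly without first building the whole list.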

2. pandas:

import pandas as pd

# Set the start and end dates (both inclusive)
DatePeriod = pd.date_range("2019/1/1", "2019/1/3", freq='D')
# Format DatePeriod as needed; most importantly, this turns the
# DatetimeIndex of timestamps into an Index of strings
dp1 = DatePeriod.strftime('%Y%m%d')

print(DatePeriod)
print(type(dp1))
for Day in dp1:
    print(Day)
print(type(Day))


PS:
strftime is the key function here: it converts a DatetimeIndex into strings.
date = dataframe.index  # date is the DatetimeIndex
date = date.strftime('%Y-%m-%d')  # returns an Index of strings in current pandas (older releases returned a NumPy array)
dstr = date.tolist()  # converts the result into a plain Python list
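
The snippet above assumes a dataframe that already has a datetime index. A small self-contained sketch follows; the DataFrame df and its price column are made-up sample data used only for illustration.

```python
import pandas as pd

# Hypothetical sample data: three daily rows indexed by date
df = pd.DataFrame(
    {"price": [10.0, 10.5, 11.0]},
    index=pd.date_range("2019-01-01", periods=3, freq="D"),
)

date_strings = df.index.strftime("%Y-%m-%d")  # timestamps formatted as strings
date_list = date_strings.tolist()             # plain Python list of str

print(date_list)
```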


PPS:
Besides D, freq accepts other aliases, for example H for hourly and M for month end; see the table below. These are the aliases as they stood when this post was written; pandas 2.2 and later deprecate several of them, for example M in favor of ME and H in favor of h.

Alias    Description
B        business day frequency
C        custom business day frequency (experimental)
D        calendar day frequency
W        weekly frequency
M        month end frequency
SM       semi-month end frequency (15th and end of month)
BM       business month end frequency
CBM      custom business month end frequency
MS       month start frequency
SMS      semi-month start frequency (1st and 15th)
BMS      business month start frequency
CBMS     custom business month start frequency
Q        quarter end frequency
BQ       business quarter end frequency
QS       quarter start frequency
BQS      business quarter start frequency
A        year end frequency
BA       business year end frequency
AS       year start frequency
BAS      business year start frequency
BH       business hour frequency
H        hourly frequency
T, min   minutely frequency
S        secondly frequency
L, ms    milliseconds
U, us    microseconds
N        nanoseconds
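
To see how the alias changes the result, the sketch below generates short ranges with three aliases from the table (W, B, and MS). The start dates are arbitrary and chosen only for illustration.

```python
import pandas as pd

# Weekly: 'W' anchors on Sundays by default
weekly = pd.date_range("2019-01-01", periods=3, freq="W")

# Business days: Saturdays and Sundays are skipped
business = pd.date_range("2019-01-04", periods=3, freq="B")

# Month start: the first day of each month inside the range
month_starts = pd.date_range("2019-01-01", "2019-03-31", freq="MS")

for name, rng in [("W", weekly), ("B", business), ("MS", month_starts)]:
    print(name, rng.strftime("%Y%m%d").tolist())
```

Note that with W the first value is the first Sunday on or after the start date, not the start date itself.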
Sources:
Python Generate Datetime Series

python + pandas: working with times, dates, and time series

https://stackoverflow.com/questions/30132282/datetime-to-string-with-series-in-python-pandas

------------------------------------------------------------------------------------------------------
http://www.coco-in.net/forum.php?mod=viewthread&tid=38351&page=2
The following write-up by another author, from the thread linked above, is also a useful reference.


The following explains how to generate date strings and loop over them.
This is rough code; adapt it to your own needs.

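Suppose the start and end dates are first given as strings.
I usually work with Gregorian years and convert to the ROC (Minguo) year later.
date_start = '20150801'
date_end = '20150831'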
Convert them to date objects:
from datetime import date, timedelta
from time import sleep

date_s = date(int(date_start[:4]), int(date_start[4:6]), int(date_start[6:8]))
date_e = date(int(date_end[:4]), int(date_end[4:6]), int(date_end[6:8]))
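
The slicing above can also be expressed with datetime.strptime, which parses the string against a format and raises ValueError on malformed input. A minimal sketch, where the helper name parse_yyyymmdd is hypothetical:

```python
from datetime import datetime


def parse_yyyymmdd(text):
    """Parse a 'YYYYMMDD' string into a datetime.date (illustrative helper)."""
    return datetime.strptime(text, "%Y%m%d").date()


date_s = parse_yyyymmdd("20150801")
date_e = parse_yyyymmdd("20150831")
print(date_s, date_e)
```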
Loop from the start date to the end date:
while date_s <= date_e:
    yyyy = date_s.strftime('%Y')
    mm = date_s.strftime('%m')
    dd = date_s.strftime('%d')

    # Build the ROC "year/month/day" string
    twymd = '{}/{}/{}'.format(int(yyyy) - 1911, mm, dd)

    data = {'download': 'csv', 'qdate': twymd, 'selectType': 'ALLBUT0999'}
    # Download the data; omitted here.
    # Pause for one second
    sleep(1)

    # Advance the date by one day
    date_s = date_s + timedelta(days=1)
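
The date-handling part of the loop can be separated from the download step so that it is easy to check on its own. A sketch under that assumption, with roc_date_strings as a hypothetical helper name; the download itself is still left out.

```python
from datetime import date, timedelta


def roc_date_strings(start, end):
    """Yield 'ROC year/MM/DD' strings for each day from start to end inclusive.

    The ROC (Minguo) year is the Gregorian year minus 1911.
    roc_date_strings is an illustrative name, not from the original post.
    """
    current = start
    while current <= end:
        yield "{}/{:02d}/{:02d}".format(
            current.year - 1911, current.month, current.day
        )
        current += timedelta(days=1)


roc_days = list(roc_date_strings(date(2015, 8, 1), date(2015, 8, 3)))
print(roc_days)
```

Each yielded string can then be placed in the qdate field of the request data, with the one-second pause kept in the calling loop.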
