I am trying to find and plot the hourly error rate from a custom log file that looks like this:
<2019-12-19T16:02:14.776+0000> WARNING <clientToServer.smaap> Response NOT OK --Transport content: <html><head><title>500 Internal Server Error</title></head><body bgcolor="white"><center><h1>500 Internal Server Error</h1></center><hr><center> </center></body></html>
--ServiceInfo: PostDataService - client: A201ACCDC3DAB47C0B4BF021D11785DFD49F1863 ,tenant: 19b0be25fd5248588f0631a820a43c88 ,payloadType: apm_metric ,messageForClient: false Observations: DeploymentMetric[1] MappingMetric[1] RequestTypeMetric[1] LinkMetric[1] --Transport info: HTTP method: POST ,URL: https://oc-19b0be25fd5248588f0631a820a43c88.api.smouloud.com/static/data.storage/m_metric ,response status: 500 ,response headers: Connection=keep-alive , Content-Length=182 , Date=Thu, 19 Dec 2019 16:02:14 GMT , Content-Type=text/html <Ref: WKWLUTVWBDJ2GAGOIRZL5VHMK3ZTJIWX>
<2019-12-19T16:04:12.242+0000> WARNING <ACTION.JCB> <e646de45-4c09-4a6e-a5d1-3552db8fc0dc-0000000b> Failed to get business interface of sdpinternal.messaging.management.em.ServerTargetImpl, class will not be monitored. <Ref: N2IOE3DZNWKAYSYLJTN5AWRQJP7DKMTS>
<2019-12-19T16:04:14.745+0000> WARNING <clientToServer.smaap> Response NOT OK --Transport content: <html><head><title>500 Internal Server Error</title></head><body bgcolor="white"><center><h1>500 Internal Server Error</h1></center><hr><center> </center></body></html>
--ServiceInfo: PostDataService - client: A201ACCDC3DAB47C0B4BF021D11785DFD49F1863 ,tenant: 19b0be25fd5248588f0631a820a43c88 ,payloadType: apm_metric ,messageForClient: false Observations: HostMetric[1] DeploymentMetric[1] JVMMetric[1] InfrastructureMetric[1] MappingMetric[1] RequestTypeMetric[1] LinkMetric[1] ThreadPoolMetric[1] AppServerMetric[1] ConnectionPoolMetric[1] --Transport info: HTTP method: POST ,URL: https://omc-19b0be25fd5248588f0631a820a43c88.api.omc.ocp.oraclecloud.com/static/data.storage/apm_metric ,response status: 500 ,response headers: Connection=keep-alive , Content-Length=182 , Date=Thu, 19 Dec 2019 16:04:14 GMT , Content-Type=text/html <Ref: PU5HERXNVLSVAG33LIOPJVKYSZJC4R73>
<2019-12-19T16:04:14.753+0000> WARNING <clientToServer.transport> Error connecting to https://oc-19b0be25fd5248588f0631a820a43c88.api.smouloud.com/static/data.storage/m_metric <Ref: NL6XDJAALZ23BM4PPRRFWRFFBC6KLYSE>
I would like to plot the number of "500 Internal Server Error" responses per hour. I tried to parse the log into a pandas DataFrame using the following:
import pandas as pd
from io import StringIO  # pandas.compat.StringIO was removed in pandas 0.25
tmp=u"""<2019-12-19T16:02:14.776+0000> WARNING <clientToServer.smaap> Response NOT OK --Transport content: <html><head><title>500 Internal Server Error</title></head><body bgcolor="white"><center><h1>500 Internal Server Error</h1></center><hr><center> </center></body></html>
--ServiceInfo: PostDataService - client: A201ACCDC3DAB47C0B4BF021D11785DFD49F1863 ,tenant: 19b0be25fd5248588f0631a820a43c88 ,payloadType: apm_metric ,messageForClient: false Observations: DeploymentMetric[1] MappingMetric[1] RequestTypeMetric[1] LinkMetric[1] --Transport info: HTTP method: POST ,URL: https://oc-19b0be25fd5248588f0631a820a43c88.api.smouloud.com/static/data.storage/m_metric ,response status: 500 ,response headers: Connection=keep-alive , Content-Length=182 , Date=Thu, 19 Dec 2019 16:02:14 GMT , Content-Type=text/html <Ref: WKWLUTVWBDJ2GAGOIRZL5VHMK3ZTJIWX>
<2019-12-19T16:04:12.242+0000> WARNING <ACTION.JCB> <e646de45-4c09-4a6e-a5d1-3552db8fc0dc-0000000b> Failed to get business interface of sdpinternal.messaging.management.em.ServerTargetImpl, class will not be monitored. <Ref: N2IOE3DZNWKAYSYLJTN5AWRQJP7DKMTS>
<2019-12-19T16:04:14.745+0000> WARNING <clientToServer.smaap> Response NOT OK --Transport content: <html><head><title>500 Internal Server Error</title></head><body bgcolor="white"><center><h1>500 Internal Server Error</h1></center><hr><center> </center></body></html>
--ServiceInfo: PostDataService - client: A201ACCDC3DAB47C0B4BF021D11785DFD49F1863 ,tenant: 19b0be25fd5248588f0631a820a43c88 ,payloadType: apm_metric ,messageForClient: false Observations: HostMetric[1] DeploymentMetric[1] JVMMetric[1] InfrastructureMetric[1] MappingMetric[1] RequestTypeMetric[1] LinkMetric[1] ThreadPoolMetric[1] AppServerMetric[1] ConnectionPoolMetric[1] --Transport info: HTTP method: POST ,URL: https://omc-19b0be25fd5248588f0631a820a43c88.api.omc.ocp.oraclecloud.com/static/data.storage/apm_metric ,response status: 500 ,response headers: Connection=keep-alive , Content-Length=182 , Date=Thu, 19 Dec 2019 16:04:14 GMT , Content-Type=text/html <Ref: PU5HERXNVLSVAG33LIOPJVKYSZJC4R73>
<2019-12-19T16:04:14.753+0000> WARNING <clientToServer.transport> Error connecting to https://oc-19b0be25fd5248588f0631a820a43c88.api.smouloud.com/static/data.storage/m_metric <Ref: NL6XDJAALZ23BM4PPRRFWRFFBC6KLYSE>"""
df = pd.read_csv(StringIO(tmp), comment=' --', sep='0> ', names=['Time','Text'])
# try to drop the continuation rows, i.e. those whose Time column starts with ' --':
indexNames = df[df['Time'].str.startswith(' --')].index
df.drop(indexNames, inplace=True)
# remove < by strip and convert column Time to_datetime:
df.Time = pd.to_datetime(df.Time.str.strip('<'), format='%Y-%m-%dT%H:%M:%S.%f+0000')
df.Text = df.Text.str.strip()
print(df)
print(df.dtypes)
For some reason I am unable to remove the continuation rows (the ones I try to match with ' --') from the dataframe.
I am using pandas 0.24.2 with Python 3.7.3. Any ideas?
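For reference, here is a minimal sketch of the result I am after, written without read_csv. It assumes that every log record starts with a line beginning with a <timestamp> and that any other line (such as the "--ServiceInfo" lines) continues the previous record. The function name count_errors_per_hour and the shortened sample text are made up for illustration; the sample lines are abbreviated versions of the log above.

```python
import re
from collections import Counter
from datetime import datetime

# Assumption: a new record always starts with "<YYYY-MM-DDTHH:MM:SS.mmm+0000>".
RECORD_START = re.compile(r"^<(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})\+0000>")


def count_errors_per_hour(log_text, needle="500 Internal Server Error"):
    """Return a Counter mapping each hour to the number of records containing needle."""
    records = []  # list of [timestamp, full record text]
    for line in log_text.splitlines():
        match = RECORD_START.match(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
            records.append([stamp, line])
        elif records:
            # Continuation line: attach it to the record it belongs to.
            records[-1][1] += "\n" + line

    counts = Counter()
    for stamp, text in records:
        if needle in text:
            # Truncate the timestamp to the start of its hour.
            counts[stamp.replace(minute=0, second=0, microsecond=0)] += 1
    return counts


# Abbreviated sample: same timestamps as the log above, with the long lines shortened.
sample = """<2019-12-19T16:02:14.776+0000> WARNING <clientToServer.smaap> Response NOT OK --Transport content: <title>500 Internal Server Error</title>
--ServiceInfo: PostDataService ,response status: 500
<2019-12-19T16:04:12.242+0000> WARNING <ACTION.JCB> Failed to get business interface
<2019-12-19T16:04:14.745+0000> WARNING <clientToServer.smaap> Response NOT OK --Transport content: <title>500 Internal Server Error</title>
--ServiceInfo: PostDataService ,response status: 500
<2019-12-19T16:04:14.753+0000> WARNING <clientToServer.transport> Error connecting"""

counts = count_errors_per_hour(sample)
for hour, n in sorted(counts.items()):
    print(hour, n)
```

If something like this is a reasonable approach, I assume the counts could then be handed to pandas for plotting, for example with pd.Series(counts).sort_index().plot(kind='bar'), but I would still like to understand why the row removal in my read_csv attempt does not work.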