
I am using an XGBoost classifier inside a pipeline that consists of a normalization step and the XGBoost model itself. The model trains and saves successfully in a notebook environment.

The dataset comprises only two types of features: float and integer; there are no datetime features present during model creation.

When I build the model in a Vertex AI pipeline, the model is created and stored in the specified location. After it is stored, however, an error is raised and the entire Vertex AI pipeline fails. The error message following model creation is:

TypeError: Object of type Timestamp is not JSON serializable

I have attempted several techniques to adjust the data types prior to model creation; however, I continue to encounter the same issue.

Code:

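import joblib
from sklearn.pipeline import Pipeline

# Define the pipeline (`ct` is the preprocessing transformer and `model`
# the XGBoost classifier; both are created earlier and not shown here)
pipeline = Pipeline(steps=[("preprocessor", ct), ("classifier", model)])

# Fit the model
final_model = pipeline.fit(X_train, y_train)

# Save the fitted pipeline
joblib.dump(final_model, "final_model.pkl")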

Error:

  File "/usr/local/lib/python3.10/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/local/lib/python3.10/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type Timestamp is not JSON serializable
  • Drive-by comment: Unfamiliar with Vertex AI, but it appears to be using Protobuf to store the model. Timestamp is a (so-called) Well-known Type in Protobuf. Because it's a non-trivial class, it isn't JSON serializable directly (i.e. you can't json.dumps(timestamp)). However, there are Protobuf methods to JSON serialize Protobuf types (i.e. MessageToJson(timestamp)). If you have the ability to post-process the model, you should be able to serialize the Timestamps this way. Commented Nov 1, 2024 at 19:27
  • Hi @DazWilkin, Thank you for your feedback. This model currently doesn’t include features with timestamps. I am able to produce the model in Jupyter Notebook within the same environment and haven’t encountered any timestamp-related issues there. Any insights or suggestions you might have regarding this would be greatly appreciated. Commented Nov 3, 2024 at 23:17
  • I suspect that when Vertex interacts with your model, it's adding Timestamps as metadata. I'm unfamiliar with the process and Vertex so I'm unable to help. Commented Nov 4, 2024 at 4:05

1 Answer


The error message you're encountering, TypeError: Object of type Timestamp is not JSON serializable, indicates that a Timestamp object is being passed to Python's json encoder, which cannot handle Timestamp objects directly. A typical case is a pandas DataFrame containing a Timestamp column that is converted to JSON. One way to avoid the problem is to convert the Timestamps to strings first:

import pandas as pd

# Sample DataFrame with a Timestamp column
df = pd.DataFrame({'timestamp': pd.to_datetime(['2023-11-22 12:34:56', '2023-12-01 09:00:00'])})

# Convert Timestamps to strings in a desired format (e.g., ISO 8601)
df['timestamp_str'] = df['timestamp'].dt.strftime('%Y-%m-%dT%H:%M:%S.%fZ')

# Now, you can serialize the DataFrame to JSON without encountering the error:
json_data = df.to_json(orient='records')

print(json_data)

By converting the Timestamps to strings, the DataFrame can now be serialized to JSON without any errors. Vertex AI or any other JSON consumer will be able to interpret the timestamps in the desired format.
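If the Timestamp is not in a DataFrame column but somewhere in a dictionary that gets passed to json.dumps (for example metadata attached to the model, as the comments on the question suspect), a fallback encoder may be easier than converting each value by hand. The following is a minimal sketch under that assumption, not a confirmed fix for the Vertex AI failure. The helper name to_jsonable and the metadata dictionary are hypothetical. It relies on pandas.Timestamp being a subclass of datetime.datetime, so the standard library check also covers pandas values.

```python
import json
from datetime import date, datetime


def to_jsonable(obj):
    """Fallback for json.dumps: encode date/datetime-like objects as ISO 8601 strings."""
    # pandas.Timestamp subclasses datetime.datetime, so it is matched here too.
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    # Keep the standard behaviour for anything else that json cannot encode.
    raise TypeError(f"Object of type {obj.__class__.__name__} is not JSON serializable")


# Hypothetical metadata; in a real pipeline this would be whatever is being serialized.
metadata = {"model": "final_model.pkl", "created_at": datetime(2024, 11, 1, 19, 27)}

# json.dumps calls `default` only for objects it cannot encode on its own.
encoded = json.dumps(metadata, default=to_jsonable)
print(encoded)
```

This only helps where you control the json.dumps call. If the serialization happens inside Vertex AI itself, the values would have to be converted to strings (or plain numbers) before they are handed over.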
