I am using the XGBoost classifier inside a scikit-learn pipeline that consists of a normalization step and the XGBoost model itself. The model trains successfully in the Notebook environment.
The dataset contains only float and integer features; no datetime features are present at model-creation time.
When the model is built through Vertex AI, it is created and saved to the specified location, but immediately after saving an error is raised and the entire Vertex AI pipeline fails. The error message after model creation is:
TypeError: Object of type Timestamp is not JSON serializable
I have tried several ways of adjusting the data types before model creation, but I keep hitting the same error.
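For reference, one way to verify that no Timestamp columns survive preprocessing is to inspect the dtypes just before fitting. This is a sketch with a hypothetical DataFrame standing in for the real `X_train`:

```python
import pandas as pd

# Hypothetical data standing in for the real X_train
X_train = pd.DataFrame({"f1": [0.1, 0.2, 0.3], "f2": [1, 2, 3]})

# List any columns pandas still treats as datetimes
dt_cols = X_train.select_dtypes(include=["datetime", "datetimetz"]).columns.tolist()
print(dt_cols)  # expect [] if only float/int features remain

# If any show up, convert them to a numeric representation, e.g. epoch seconds
for col in dt_cols:
    X_train[col] = X_train[col].astype("int64") // 10**9
```

If this prints a non-empty list, a datetime column is sneaking past the preprocessing step even though the intended features are float/int only.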
Code:
from sklearn.pipeline import Pipeline
import joblib

# Define the pipeline: preprocessing (ct) followed by the XGBoost classifier (model)
pipeline = Pipeline(steps=[("preprocessor", ct), ("classifier", model)])

# Fit the model
final_model = pipeline.fit(X_train, y_train)

# Save the fitted pipeline
joblib.dump(final_model, "final_model.pkl")
Error:
File "/usr/local/lib/python3.10/json/encoder.py", line 257, in iterencode
return _iterencode(o, 0)
File "/usr/local/lib/python3.10/json/encoder.py", line 179, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type Timestamp is not JSON serializable
`Timestamp` is a (so-called) Well-Known Type in Protobuf. Because it is a non-trivial class, it is not directly JSON serializable (i.e. you can't `json.dumps(timestamp)`). However, Protobuf provides methods to JSON-serialize Protobuf types (e.g. `MessageToJson(timestamp)`). If you have the ability to post-process the model, you should be able to serialize the Timestamps this way.
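A minimal sketch of that approach, assuming the `protobuf` package is installed:

```python
from google.protobuf.timestamp_pb2 import Timestamp
from google.protobuf.json_format import MessageToJson, Parse
import json

# Build a protobuf Timestamp set to the current time
ts = Timestamp()
ts.GetCurrentTime()

# json.dumps(ts) would raise the TypeError from the question;
# MessageToJson knows the well-known types and emits an RFC 3339 string
serialized = MessageToJson(ts)

# Round-trip back into a Timestamp if needed
restored = Parse(serialized, Timestamp())
```

`MessageToJson` returns a JSON document (here, a quoted RFC 3339 string such as `"2024-05-01T12:00:00Z"`), so it can be embedded directly wherever the pipeline expects JSON.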