import numpy as np
import pandas as pd

df = pd.DataFrame({
'col_str': ["a", "b", "c"],
'col_lst_str': [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]],
'col_lst_int': [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
'col_arr_int': [np.array([1, 2, 3]), np.array([4, 5, 6]), np.array([7, 8, 9])]
})
print(df.dtypes)
print(pd.api.types.is_object_dtype(df['col_lst_int'].dtype))  # True, as expected
print(pd.api.types.is_object_dtype(df['col_arr_int'].dtype))  # True, as expected
print(pd.api.types.is_string_dtype(df['col_lst_int'].dtype))  # True, confusing!
print(pd.api.types.is_string_dtype(df['col_arr_int'].dtype))  # True, confusing!
print(df['col_lst_int'].apply(lambda x: isinstance(x, list)).all())        # True, as expected
print(df['col_arr_int'].apply(lambda x: isinstance(x, np.ndarray)).all())  # True, as expected
When a pandas DataFrame column contains lists or NumPy arrays of integer elements (column dtype=object), both pd.api.types.is_object_dtype() and pd.api.types.is_string_dtype() return True, which is completely misleading. I was expecting pd.api.types.is_string_dtype() to return False. Now my column seems to have two valid dtypes, object and string, which can cause serious problems in conditional logic. Even the official API doc is misleading, since it claims the elements must be inferable as strings. How can the elements 1, 2, 3 be inferred as strings in my example? It seems to work as expected with a pandas Series, though. Is it a bug with DataFrames?
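One workaround (my own sketch, not something the pandas docs prescribe for this exact case): pd.api.types.infer_dtype inspects the actual values rather than the object dtype label, so it can tell a genuine string column apart from an object column that holds lists or arrays.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col_str': ["a", "b", "c"],
    'col_lst_int': [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
})

# infer_dtype looks at the values, not just the (object) dtype label
print(pd.api.types.infer_dtype(df['col_str']))      # 'string'
print(pd.api.types.infer_dtype(df['col_lst_int']))  # not 'string' for the list column
```

A check like `infer_dtype(s) == "string"` avoids the ambiguity, at the cost of scanning the column's values.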

df['col_lst_int'] => dtype: object, not int, and is_string_dtype(object) => True. But is_string_dtype(pd.Series([1, 2])) is False. Why did you expect a column of lists of ints to be typed int? is_string_dtype(object) => True is not a bug, because this behavior is explicitly stated in the doc. And how could a list (of ints or anything else) not be an object? With df2 = pd.DataFrame({'col_int': [1, 2, 3]}), df2['col_int'].dtype => int64, and pd.api.types.is_string_dtype(df2['col_int']) => False.
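The contrast in the reply above can be reproduced with a short sketch. The first check follows the documented behavior for a bare object dtype; the others show that a real int64 column is unambiguous.

```python
import numpy as np
import pandas as pd

# A bare object dtype *could* hold strings, so pandas reports True (documented)
print(pd.api.types.is_string_dtype(np.dtype(object)))  # True

# An int64 column leaves no ambiguity
df2 = pd.DataFrame({'col_int': [1, 2, 3]})
print(df2['col_int'].dtype)                          # int64
print(pd.api.types.is_string_dtype(df2['col_int']))  # False
```

The confusion in the original report comes from passing `.dtype` (always plain object for such columns) rather than the Series itself, where newer pandas versions can inspect the values.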