I have some very ugly data I am trying to massage. It consists of SKUs, and I want to group them into product lines. E.g.:
PRODUCT_ID
----------
313L30WHITE
313L40WHITE
313L30BLACK
3333
2L10RED
2L20BLACK
32341/30/BLK
Basically, I want to group items by the leading numeric characters in the PRODUCT_ID field, i.e., all the characters up to the first non-numeric character. E.g.:
PRODUCT_ID    GROUP
------------  -----
313L30WHITE   313
313L40WHITE   313
313L30BLACK   313
3333          3333
2L10RED       2
2L20BLACK     2
32341/30/BLK  32341
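To make the rule concrete, here is a minimal sketch of the prefix extraction in Python using the standard `re` module. The function name `product_group` is just an illustrative choice, and returning an empty string for an ID with no leading digits is an assumption, since the sample data has no such case.

```python
import re

# Matches the (possibly empty) run of ASCII digits at the start of a string.
LEADING_DIGITS = re.compile(r"[0-9]*")


def product_group(product_id):
    """Return every character up to the first non-numeric character.

    Assumption: an ID that does not start with a digit yields "".
    """
    # The pattern can match an empty string, so match() never returns None here.
    return LEADING_DIGITS.match(product_id).group()


product_ids = [
    "313L30WHITE",
    "313L40WHITE",
    "313L30BLACK",
    "3333",
    "2L10RED",
    "2L20BLACK",
    "32341/30/BLK",
]

# Pair each SKU with its derived group.
rows = [(pid, product_group(pid)) for pid in product_ids]

for pid, group in rows:
    print(pid, group)
```

The group is kept as a string rather than converted to an integer, so any leading zeros in a SKU would be preserved.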
It seems like a SQL solution would not be elegant, so I would prefer a Python solution that creates a new table with a new GROUP column.
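As a rough sketch of what "a new table with a new GROUP column" could look like, the example below uses Python's built-in `sqlite3` module with an in-memory database. The database engine is not stated above, so SQLite is purely an assumption, and the table names `products` and `products_grouped` and the function name `leading_digits` are hypothetical. Note that GROUP is a reserved word in SQL, so the column name has to be quoted.

```python
import re
import sqlite3


def leading_digits(product_id):
    """Return the run of ASCII digits at the start of product_id ("" if none).

    Assumption: PRODUCT_ID is never NULL; a NULL would raise a TypeError here.
    """
    return re.match(r"[0-9]*", product_id).group()


# Assumption: an in-memory SQLite database stands in for the real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (PRODUCT_ID TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?)",
    [
        ("313L30WHITE",),
        ("313L40WHITE",),
        ("313L30BLACK",),
        ("3333",),
        ("2L10RED",),
        ("2L20BLACK",),
        ("32341/30/BLK",),
    ],
)

# Expose the Python function to SQL so it can be called inside a query.
conn.create_function("leading_digits", 1, leading_digits)

# Build the new table; "GROUP" is quoted because it is a reserved word.
conn.execute(
    'CREATE TABLE products_grouped AS '
    'SELECT PRODUCT_ID, leading_digits(PRODUCT_ID) AS "GROUP" FROM products'
)

grouped = conn.execute(
    'SELECT PRODUCT_ID, "GROUP" FROM products_grouped ORDER BY rowid'
).fetchall()

for product_id, group in grouped:
    print(product_id, group)
```

Registering the function with `create_function` keeps the grouping rule in Python while letting the database build the new table in a single statement.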
Anyone have any suggestions?