I have written a PHP class that gets the headers from a .xls spreadsheet and creates a table with those headers as column names.

It also gets each row of data from the spreadsheet and places them into an array.

What I would then like to do is determine the best data type for each column. It's mostly going to be text, but there will be numbers in there. For example, £1,000 would need to be saved as 1000 and be an int rather than a string.

It needs to be done dynamically as each spreadsheet has different column names and data in different orders.

I don't really know how to go about this. I was thinking maybe a foreach loop and preg_match?

Any ideas are very much appreciated.

2 Answers

I think you need to check all the data in a column to determine whether it contains any non-numeric values ( http://ru.php.net/manual/en/function.is-numeric.php ). If every value is numeric, you can use an INT/TINYINT/MEDIUMINT type of the appropriate size. If some values are not numeric, you can use CHAR/VARCHAR/BLOB/TEXT with an appropriate length.
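
As a rough sketch of that column check, assuming a MySQL target and that each cell arrives as a string: the helper names `stripCurrency` and `suggestColumnType` are made up for illustration, and the choice of currency symbols and of "," as a thousands separator is an assumption about the data.

```php
<?php
// Hypothetical helper: "£1,000" -> "1000". Assumes "," is only ever a
// thousands separator and that £, $ and € are the currency symbols in use.
function stripCurrency(string $cell): string
{
    return str_replace(['£', '$', '€', ',', ' '], '', trim($cell));
}

// Hypothetical helper: suggest a MySQL type for one column of cell values.
function suggestColumnType(array $cells): string
{
    $maxLength = 1;
    $maxAbs = 0;
    $allNumeric = true;
    $allWhole = true;

    foreach ($cells as $cell) {
        $text = trim((string) $cell);
        if ($text === '') {
            continue; // empty cells say nothing about the type
        }
        // strlen() counts bytes, so this may overestimate for multibyte text.
        $maxLength = max($maxLength, strlen($text));

        $clean = stripCurrency($text);
        if (!is_numeric($clean)) {
            $allNumeric = false;
            continue;
        }
        if (preg_match('/^-?\d+$/', $clean) !== 1) {
            $allWhole = false; // numeric, but with a decimal point or exponent
        }
        $maxAbs = max($maxAbs, abs((float) $clean));
    }

    if (!$allNumeric) {
        return $maxLength <= 255 ? "VARCHAR($maxLength)" : 'TEXT';
    }
    if (!$allWhole) {
        return 'DOUBLE';
    }
    // Upper bounds of the signed MySQL integer types.
    if ($maxAbs <= 127) {
        return 'TINYINT';
    }
    if ($maxAbs <= 32767) {
        return 'SMALLINT';
    }
    if ($maxAbs <= 8388607) {
        return 'MEDIUMINT';
    }
    if ($maxAbs <= 2147483647) {
        return 'INT';
    }
    return 'BIGINT';
}

echo suggestColumnType(['£1,000', '250', '12']), "\n"; // SMALLINT
echo suggestColumnType(['Alice', 'Bob']), "\n";        // VARCHAR(5)
```

Note that `is_numeric()` on its own rejects "£1,000", which is why the currency symbol and separator are stripped first.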

I would start by deciding how precise I want to be. For example, if I find a column containing only 1s and 0s, should I define it as binary, or as integer in case values other than 0 and 1 appear in the future?

Also, are you going to parse all the rows of the spreadsheet, or only a few rows at the top, before deciding which data type to use? In the above example, you may have 0s and 1s at the top of the spreadsheet but find other numbers closer to the bottom. If you decide to review only the top rows, you may want to be less strict about the data type. So if you find only 0s and 1s, you may decide to define the field as integer, not binary. This would reduce the chance of errors when importing the data.
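
One way to express that trade-off in PHP is sketched below. The function name `integerTypeFor` and its `$sampledOnly` flag are hypothetical, the type names assume MySQL, and `TINYINT(1)` stands in for the "binary" case purely as an assumption.

```php
<?php
// Hypothetical helper: pick an integer type for a column of whole numbers.
// $sampledOnly = true means only some rows were inspected, so the narrow
// types are skipped to leave headroom for values further down the sheet.
function integerTypeFor(array $values, bool $sampledOnly): string
{
    $maxAbs = 0;
    foreach ($values as $value) {
        $maxAbs = max($maxAbs, abs((int) $value));
    }

    if ($sampledOnly) {
        return $maxAbs <= 2147483647 ? 'INT' : 'BIGINT';
    }
    if ($maxAbs <= 1) {
        return 'TINYINT(1)'; // only 0s and 1s were seen in the full scan
    }
    if ($maxAbs <= 127) {
        return 'TINYINT';
    }
    if ($maxAbs <= 32767) {
        return 'SMALLINT';
    }
    if ($maxAbs <= 8388607) {
        return 'MEDIUMINT';
    }
    return $maxAbs <= 2147483647 ? 'INT' : 'BIGINT';
}

$allRows = [0, 1, 1, 0, 300];
$topRows = array_slice($allRows, 0, 4);

echo integerTypeFor($topRows, true), "\n";  // INT: still safe for the 300 further down
echo integerTypeFor($allRows, false), "\n"; // SMALLINT: exact fit after a full scan
```

Treating the top four rows as if they were the whole sheet would give `TINYINT(1)` here, which is the import error the paragraph above warns about.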

You could use logic somewhat like this:

for each row (you can decide whether to check all the rows or just a few)
    if is_int() -> data field is integer;
    if is_float() -> data field is float;
    if is_string()
        if it is a date & time -> data field is datetime;
        if it is a date without time -> data field is date;
        else -> data field is varchar.
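
A minimal PHP sketch of the pseudocode above might look like the following. The helper names are hypothetical. It also assumes that cells may arrive as strings (in which case `is_int()` and `is_float()` alone would never match, so `filter_var()` is used as a fallback) and that dates use the `Y-m-d` and `Y-m-d H:i:s` formats; other formats would need to be added.

```php
<?php
// Hypothetical helper: true if $value matches $format exactly (e.g. 'Y-m-d').
function matchesDateFormat(string $value, string $format): bool
{
    $date = DateTime::createFromFormat($format, $value);
    // Formatting back guards against overflow such as "2021-02-30".
    return $date !== false && $date->format($format) === $value;
}

// Hypothetical helper: classify a single cell.
function classifyCell($cell): string
{
    if (is_int($cell)) {
        return 'integer';
    }
    if (is_float($cell)) {
        return 'float';
    }
    $text = trim((string) $cell);
    if (filter_var($text, FILTER_VALIDATE_INT) !== false) {
        return 'integer';
    }
    if (filter_var($text, FILTER_VALIDATE_FLOAT) !== false) {
        return 'float';
    }
    if (matchesDateFormat($text, 'Y-m-d H:i:s')) {
        return 'datetime';
    }
    if (matchesDateFormat($text, 'Y-m-d')) {
        return 'date';
    }
    return 'varchar';
}

// Hypothetical helper: a column keeps a specific type only if its cells agree.
function classifyColumn(array $cells): string
{
    $types = array_unique(array_map('classifyCell', $cells));
    if (count($types) === 1) {
        return reset($types);
    }
    sort($types);
    if ($types === ['float', 'integer']) {
        return 'float'; // a mix of whole and decimal numbers still fits a float
    }
    return 'varchar'; // anything else falls back to text
}

echo classifyColumn(['1', '2.5', '3']), "\n";              // float
echo classifyColumn(['2024-01-15', '2024-02-01']), "\n";   // date
echo classifyColumn(['12', 'n/a']), "\n";                  // varchar
```

Falling back to varchar whenever the cells disagree is a deliberately cautious choice: it avoids import failures at the cost of a less specific column type.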

I hope this helps. Good luck.

1 Comment

that was a great help! definitely pointed me in the right direction, thanks a lot :)
