1

I want to split mysql queries from phpmyadmin export format. Each query will be separated by ";" but can't use explode because ";" could be there anywhere in field value too. Below is one example.

insert into wp_options (option_id, blog_id, option_name, option_value, autoload) values (201, 0, "tadv_btns2", "a:21:{i:0;s:14:\"fontsizeselect\";i:1;s:12:\"formatselect\";i:2;s:9:\"pastetext\";i:3;s:9:\"pasteword\";i:4;s:12:\"removeformat\";i:5;s:9:\"separator\";i:6;s:7:\"charmap\";i:7;s:5:\"print\";i:8;s:9:\"separator\";i:9;s:9:\"forecolor\";i:10;s:9:\"backcolor\";i:11;s:8:\"emotions\";i:12;s:9:\"separator\";i:13;s:3:\"sup\";i:14;s:3:\"sub\";i:15;s:5:\"media\";i:16;s:9:\"separator\";i:17;s:4:\"undo\";i:18;s:4:\"redo\";i:19;s:7:\"attribs\";i:20;s:7:\"wp_help\";}", "no");

insert into wp_options (option_id, blog_id, option_name, option_value, autoload) values (202, 0, "tadv_btns3", "a:0:{}", "no");

insert into wp_options (option_id, blog_id, option_name, option_value, autoload) values (203, 0, "tadv_btns4", "a:0:{}", "no");

insert into wp_options (option_id, blog_id, option_name, option_value, autoload) values (204, 0, "tadv_allbtns", "a:64:{i:0;s:6:\"wp_adv\";i:1;s:4:\"bold\";i:2;s:6:\"italic\";i:3;s:13:\"strikethrough\";i:4;s:9:\"underline\";i:5;s:7:\"bullist\";i:6;s:7:\"numlist\";i:7;s:7:\"outdent\";i:8;s:6:\"indent\";i:9;s:11:\"justifyleft\";i:10;s:13:\"justifycenter\";i:11;s:12:\"justifyright\";i:12;s:11:\"justifyfull\";i:13;s:3:\"cut\";i:14;s:4:\"copy\";i:15;s:5:\"paste\";i:16;s:4:\"link\";i:17;s:6:\"unlink\";i:18;s:5:\"image\";i:19;s:7:\"wp_more\";i:20;s:7:\"wp_page\";i:21;s:6:\"search\";i:22;s:7:\"replace\";i:23;s:10:\"fontselect\";i:24;s:14:\"fontsizeselect\";i:25;s:7:\"wp_help\";i:26;s:10:\"fullscreen\";i:27;s:11:\"styleselect\";i:28;s:12:\"formatselect\";i:29;s:9:\"forecolor\";i:30;s:9:\"backcolor\";i:31;s:9:\"pastetext\";i:32;s:9:\"pasteword\";i:33;s:12:\"removeformat\";i:34;s:7:\"cleanup\";i:35;s:12:\"spellchecker\";i:36;s:7:\"charmap\";i:37;s:5:\"print\";i:38;s:4:\"undo\";i:39;s:4:\"redo\";i:40;s:13:\"tablecontrols\";i:41;s:4:\"cite\";i:42;s:3:\"ins\";i:43;s:3:\"del\";i:44;s:4:\"abbr\";i:45;s:7:\"acronym\";i:46;s:7:\"attribs\";i:47;s:5:\"layer\";i:48;s:5:\"advhr\";i:49;s:4:\"code\";i:50;s:11:\"visualchars\";i:51;s:11:\"nonbreaking\";i:52;s:3:\"sub\";i:53;s:3:\"sup\";i:54;s:9:\"visualaid\";i:55;s:10:\"insertdate\";i:56;s:10:\"inserttime\";i:57;s:6:\"anchor\";i:58;s:10:\"styleprops\";i:59;s:8:\"emotions\";i:60;s:5:\"media\";i:61;s:10:\"blockquote\";i:62;s:9:\"separator\";i:63;s:1:\"|\";}", "no");
1

6 Answers 6

8

Given a string containing multiple, semi-colon separated SQL statements, the following function parses out each individual statement and returns them all in an array. It uses one (non-trivial) regex, and one call to preg_match_all(), and correctly handles single and multi-line comments, and single and double-quoted strings, (each of which may contain non-terminating semi-colons to be ignored):

function split_sql($sql_text) {
    // Return array of ; terminated SQL statements in $sql_text.
    $re_split_sql = '%(?#!php/x re_split_sql Rev:20170816_0600)
        # Match an SQL record ending with ";"
        \s*                                     # Discard leading whitespace.
        (                                       # $1: Trimmed non-empty SQL record.
          (?:                                   # Group for content alternatives.
            \'[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*\'  # Either a single quoted string,
          | "[^"\\\\]*(?:\\\\.[^"\\\\]*)*"      # or a double quoted string,
          | /\*[^*]*\*+(?:[^*/][^*]*\*+)*/      # or a multi-line comment,
          | \#.*                                # or a # single line comment,
          | --.*                                # or a -- single line comment,
          | [^"\';#]                            # or one non-["\';#-]
          )+                                    # One or more content alternatives
          (?:;|$)                               # Record end is a ; or string end.
        )                                       # End $1: Trimmed SQL record.
        %x';  // End $re_split_sql.
    if (preg_match_all($re_split_sql, $sql_text, $matches)) {
        return $matches[1];
    }
    return array();
}

Edit 2017-08-15: Fixed error in multiline comment portion of regex pointed out by @jxmallett. Edit 2017-08-16: Tidied regex (added regex shebang and removed unnecessary group $2).

Sign up to request clarification or add additional context in comments.

10 Comments

This is great, only wish it could separate comments into their own index instead of grouping them with the following sql statement. Not sure how to do that with multi-line comments and without something hacky like checking for keywords that are only valid at the start of a sql statement... :(
For now I'm just stripping out all comments with this answer: stackoverflow.com/questions/9690448/… and then using split_sql() to break out the result.
does not support PROCEDURES.
and the same regexp for JS: regex = '\s*((?:\'[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*\'|"[^"\\\\]*(?:\\\\.[^"\\\\]*)*"|\#.*|\\/\\*[\\w\\W]*?(?=\\*\\/)\\*\\/|--.*|[^"\';#])+(?:;|$))';
It matches this as one query when it's actually 2: /* comment */ SELECT * FROM foo; /* comment 2 */ SELECT * FROM bar; It seems to be a problem with multi-line comments. I used '@(([\'"]).*?[^\\\]\2)|((?:\#|--).*?$|/\*(?:[^/*]|/(?!\*)|\*(?!/)|(?R))*\*\/)\s*|(?<=;)\s+@ms' to strip all the comments before running split_sql.
|
1

one way to do is :

explode by ");"

it will return array of statements

then concat ")" at the end of each statement in array

Comments

1

My suggestion would be to replace your delimiter with something that would be unique. For example, a semicolon won't be unique (as you described) so you can't split on this. Newlines (as was stated before) may be a good and simple option.

Comments

0

Are those lines separated by a "newline" character? If yes just explode them:

$my_array = explode("\n",$content);

1 Comment

actually some queries are divided in more than one line so explode("\n",$content) won't work
0

how about something like this

$fh = fopen('/path/to/your/file.sql', 'r') or die($php_errormsg);
while (!feof($fh)) {
    $line = fgets($fh, 4096);
    if (preg_match($pattern, $line)) { $sql_queries[ ] = $line; }
}
fclose($fh);

Where $pattern can be initialized to an sql insert query format. Using newline wont be a good option as large sql inserts stretch for more than one line

Comments

0

I modified "ridgerunner" function "split_sql" for some problems: no back quoted strings, was skipping queries with syntax errors, not all multi-line comments matches or last empty sql. It always return full array for debug each query or reassembling for profiling in MySQL;

function split_sql($sql, $rtrim = TRUE)
{
    $re = '%
        \s*                                                 # Discard leading whitespace.
            ((?:                                            # Group for content alternatives.
              [^;"\'#`\/-]+                                 # None of these [...]                  
            | \'[^\'\\\\]*(?:\\\\.[^\'\\\\]*)*\'?           # or a single quoted string,
            | "[^"\\\\]*(?:\\\\.[^"\\\\]*)*"?               # or a double quoted string,
            | `[^`]*`?                                      # single back quoted string
            | \/\*(?:.*?\*\/|.*)                            # or a multi-line comment,
            | (?-s)\#.*                                     # or a # single line comment (off s-modifier from here),
            | --.*                                          # or a -- single line comment,
            | -                                             # or -
            )*)                                             # One or more content alternatives
            ;                                               # Record end is a ; or string end.
        %xs';
    if ($rtrim)
        $sql = rtrim($sql, "; \t\n\r\0\x0B");
    if (preg_match_all($re, $sql.';', $matches))
        return $matches[1];        
    return [];
}

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.