I have the string below, which needs to be split into an array using VB.NET:

10,"Test, t1",10.1,,,"123"

The resulting array must have 6 rows, as below:

10
Test, t1
10.1
(empty)
(empty)
123

So:

1. Quotes around strings must be removed.
2. A comma can appear inside a string and must remain there (row 2 in the result array).
3. Fields can be empty (a comma directly after a comma in the source string, with nothing in between).

Thanks

4 Answers

4

Don't use String.Split(): it's slow, and doesn't account for a number of possible edge cases.

Don't use RegEx. RegEx can be shoe-horned to do this accurately, but to correctly account for all the cases the expression tends to be very complicated, hard to maintain, and at this point isn't much faster than the .Split() option.

Do use a dedicated CSV parser. Options include the Microsoft.VisualBasic.FileIO.TextFieldParser type, FastCSV, linq-to-csv, and a parser I wrote for another answer.
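As a rough illustration of the TextFieldParser option, here is a minimal sketch that parses a single line held in a string. The module name CsvLineDemo and the function name SplitCsvLine are made up for this example, and it assumes the Microsoft.VisualBasic assembly is referenced (it is by default in VB.NET projects).

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module CsvLineDemo
    ' Hypothetical helper: splits one CSV line into its fields.
    Function SplitCsvLine(line As String) As String()
        ' TextFieldParser reads from any TextReader, so a StringReader
        ' lets it parse an in-memory string instead of a file.
        Using parser As New TextFieldParser(New StringReader(line))
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            ' Quoted fields may contain the delimiter; the quotes are stripped.
            parser.HasFieldsEnclosedInQuotes = True
            Return parser.ReadFields()
        End Using
    End Function

    Sub Main()
        Dim fields = SplitCsvLine("10,""Test, t1"",10.1,,,""123""")
        For Each field In fields
            Console.WriteLine("[" & field & "]")
        Next
    End Sub
End Module
```

For the string in the question this should give six fields, with the two empty ones returned as empty strings. Note that ReadFields returns Nothing when there is nothing left to read, so an empty input line needs its own check.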


1 Comment

How about a coding example? :)
1

You can write a function yourself. This should do the trick:

Dim values As New List(Of String)
Dim currentValueIsString As Boolean = False
Dim valueSeparator As Char = ","c
Dim currentValue As String = String.Empty

For Each c As Char In inputString
   If c = """"c Then
      ' A quote toggles "inside a quoted string" and is not part of the value.
      currentValueIsString = Not currentValueIsString
   ElseIf c = valueSeparator AndAlso Not currentValueIsString Then
      ' An unquoted separator ends the current value.
      ' Empty fields are stored as String.Empty.
      values.Add(currentValue)
      currentValue = String.Empty
   Else
      currentValue &= c
   End If
Next

' The last value has no trailing separator, so add it after the loop.
values.Add(currentValue)


1

Here's another simple way that loops by the delimiter instead of by character:

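Public Function Parser(ByVal ParseString As String) As List(Of String)
    Dim Trimmer() As Char = {Chr(34), Chr(44)} ' quote, comma
    Parser = New List(Of String)
    ' A trailing comma means the line ends with one more, empty, field.
    Dim TrailingEmpty As Boolean = ParseString.EndsWith(Trimmer(1))
    While ParseString.Length > 0
        If ParseString.StartsWith(Trimmer(0)) Then
            ' Quoted field: drop the opening quote and read up to the closing quote.
            ParseString = ParseString.Substring(1)
            Dim Closing As Integer = ParseString.IndexOf(Trimmer(0))
            If Closing < 0 Then Closing = ParseString.Length
            Parser.Add(ParseString.Substring(0, Closing))
            ' Skip the closing quote and the comma after it, if present.
            ParseString = ParseString.Substring(Math.Min(Closing + 2, ParseString.Length))
        Else
            ' Unquoted (possibly empty) field: read up to the next comma.
            Dim Comma As Integer = ParseString.IndexOf(Trimmer(1))
            If Comma < 0 Then
                Parser.Add(ParseString)
                ParseString = ""
            Else
                Parser.Add(ParseString.Substring(0, Comma))
                ParseString = ParseString.Substring(Comma + 1)
            End If
        End If
    End While
    If TrailingEmpty Then Parser.Add("")
End Function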

This returns a list. If you must have an array, just call the ToArray method on the result when you call the function.


0

Why not just use the split method?

Dim s As String = "10,""Test, t1"",10.1,,,""123"""
s = s.Replace("""", "")
Dim arr As String() = s.Split(","c)

My VB is rusty so consider this pseudo-code

2 Comments

Won't work, because it will also split the second "field", "Test, t1", which I don't want.
It's a start, you definitely need to take care of edge cases yourself or use a library like the other answer suggested.
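To make the objection in the comments concrete, here is a small sketch of what stripping the quotes and then splitting on commas does to the string from the question. The module name NaiveSplitDemo and the function name NaiveSplit are made up for this example.

```vbnet
Imports System

Module NaiveSplitDemo
    ' Hypothetical helper: removes every quote, then splits on every comma.
    Function NaiveSplit(line As String) As String()
        Return line.Replace("""", "").Split(","c)
    End Function

    Sub Main()
        Dim parts = NaiveSplit("10,""Test, t1"",10.1,,,""123""")
        ' Prints 7 rather than the 6 fields wanted: once the quotes are gone,
        ' the comma inside "Test, t1" is treated as a separator too.
        Console.WriteLine(parts.Length)
        For Each part In parts
            Console.WriteLine("[" & part & "]")
        Next
    End Sub
End Module
```

The second field comes back as two pieces, "Test" and " t1", which is exactly the case a quote-aware parser has to handle.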
