My question is similar to this, but it's a bit more complex, and I'm too much of a beginner to adapt the method provided there.

I have tried the substring method, but it can't work because the lengths of the strings vary.

I have a string like:

Booking:
2 people

User Details:
Firstname Lastname
123456789 
[email protected]
facebook.com/username

Extras:
Service1
Service2

Pricing:
$1500/-

Comments:
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus elementum ultricies pellentesque. Sed ullamcorper orci urna, et sagittis orci rhoncus quis.

Donec laoreet neque lectus, nec congue felis cursus non. Sed ac pulvinar nunc, vel cursus nulla. Curabitur at nisl ipsum. Etiam efficitur quam tortor, id malesuada lacus laoreet ac. Cras varius felis sem, id interdum enim accumsan et. 

I need the following values stored as variables:

var people = 2
var name   = firstname + lastname
var phone  = 123456789
var email  = [email protected]
var fbook  = facebook.com/username
var extras = Service1, Service2
var price  = $1500
var comments = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus elementum ultricies pellentesque. Sed ullamcorper orci urna, et sagittis orci rhoncus quis.
    Donec laoreet neque lectus, nec congue felis cursus non. Sed ac pulvinar nunc, vel cursus nulla. Curabitur at nisl ipsum. Etiam efficitur quam tortor, id malesuada lacus laoreet ac. Cras varius felis sem, id interdum enim accumsan et." 

Keep in mind that a few values can be missing in some cases, e.g. the user didn't enter an email and/or Facebook URL, so those lines can be empty or absent altogether (without even an empty line in their place).

  • Google "regular expressions". Commented Jul 21, 2015 at 4:04

2 Answers

If you find regex too complicated (it can be difficult to get right, even if you're not a beginner), you can use simpler javascript like this:

var input = "Booking:\n2 people\n\nUser Details:\nFirstname Lastname\n123456789\n\nfacebook.com/username\n\nExtras:\nService1\nService2\n\nPricing:\n$1500/-\n\nComments:\nLorem ipsum\n\ndolor sit amet";

// The input string is split into separate lines and stored in array "lines":

var lines = input.split("\n");

// lines[0]="Booking:", lines[1]="2 people", lines[2]="", lines[3]="User Details:" ...

// The lines are split per section, and stored in 2D-array "result":
// With expect=0 we look for sections[0], which is "Booking".
// If the line "Booking:" is found, "expect" is incremented to 1, so that
// we're now looking for sections[1], which is "User Details", and so on...
// If a line is found that is not the expected section title, and it's not empty,
// we add the line to the current section with push().

var sections = ["Booking", "User Details", "Extras", "Pricing", "Comments"];
var expect = 0, result = [];

for (var i = 0; i < lines.length; i++) {
    if (lines[i] === sections[expect] + ":") result[expect++] = [];
    else if (result.length && lines[i] !== "") result[result.length - 1].push(lines[i]);
}

// result[0][0]="2 people" (first line under "Booking")
// result[1][0]="Firstname Lastname" (first line under "User Details")
// result[1][1]="123456789" (second line under "User Details")
// result[1][2]="facebook.com/username" (third line under "User Details")
// ...
// result[4][0]="Lorem ipsum" (first line under "Comments")
// result[4][1]="dolor sit amet" (third line under "Comments", empty line is skipped)

// If all 5 sections have been found, we extract the variables:

var people, name, phone = "", email = "", fbook = "", extras = "", price, comments = "";

if (result.length == 5)
{

// people = the integer number at the beginning of the 1st line of the 1st section:

    people = parseInt(result[0].shift());

// name = the 1st line of the 2nd section:

    name = result[1].shift();

// The rest of the 2nd section is searched for the phone number, email and facebook.
// Because some of these lines may be missing, we cannot simply use the 
// 1st line for phone, the 2nd line for email and the 3rd for facebook.

    while (result[1].length) {
        var temp = result[1].shift();
        // indexOf() compares literal text; search() would treat the "." as a regex wildcard.
        if (temp.indexOf("facebook.com/") == 0) fbook = temp
        else if (temp.indexOf("@") > -1) email = temp
        else phone = temp;
    }

// All the lines in the 3rd section are added to string "extras".
// If the string is not empty, we put a comma between the parts:

    while (result[2].length) {
        if (extras.length) extras += ", ";
        extras += result[2].shift();
    }

// price = the floating-point number after the leading "$" on the 1st line of the 4th section:

    price = parseFloat(result[3][0].substring(1));

// All the lines in the 5th section are added to string "comments".
// If the string is not empty, we put a newline between the parts:

    while (result[4].length) {
        if (comments.length) comments += "\n";
        comments += result[4].shift();
    }
}

alert("people: " + people + "\nname: " + name + "\nphone: " + phone + "\nemail: " + email + "\nfbook: " + fbook + "\nextras: " + extras + "\nprice: " + price + "\ncomments: " + comments);
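The section-splitting loop above can also be packaged as a function, which may make it easier to follow. The sketch below is a variation, not the answer's exact method: `splitSections` is a hypothetical helper name, it keys the result by section title instead of by index, and it accepts the titles in any order (so a comment line that is exactly "Pricing:" would be mistaken for a section title). It also shows that the comma-joining `while` loop does the same job as `Array.prototype.join`.

```javascript
// Hypothetical helper: groups non-empty lines under the section title that precedes them.
function splitSections(input, titles) {
    var result = {};
    var current = null;                       // title of the section being filled
    var lines = input.split("\n");
    for (var i = 0; i < lines.length; i++) {
        var line = lines[i];
        // A title line is a known title followed by a colon, e.g. "Extras:".
        var isTitle = line.charAt(line.length - 1) === ":" &&
                      titles.indexOf(line.slice(0, -1)) > -1;
        if (isTitle) {
            current = line.slice(0, -1);
            result[current] = [];
        } else if (current !== null && line !== "") {
            result[current].push(line);       // ordinary line: add it to the current section
        }
    }
    return result;
}

var demo = splitSections(
    "Booking:\n2 people\n\nExtras:\nService1\nService2\n\nPricing:\n$1500/-",
    ["Booking", "User Details", "Extras", "Pricing", "Comments"]
);

// join() puts ", " between the items, like the "if (extras.length)" loop does.
var extrasText = demo["Extras"].join(", ");   // "Service1, Service2"
```

A section that never appears in the input (here "User Details") is simply absent from the returned object, so the caller can check for it before reading any lines from it.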

4 Comments

I followed the example quite strictly, but it's possible that the real input has extra spaces, or \r\n-style newlines... It's possible to change the script to be more lenient with regards to formatting. Can you tell how far the script gets with your input before things go wrong? (You can simply use e.g. alert(lines.length) to check whether the split function has worked, ...)
I looked deeper into it and compared the lines output.. Turned out a few things were different in my input especially since I added a breakdown for the price. After I made it 100% similar to your input string it worked! Now I'm trying to look into it further to make it work for my updated string. Can you stick around? I would need you to explain a few workings of your script.
I'll add some more comments to the script.
I'll write an answer using regex; it's really more practical for this sort of thing.
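One way to make the first script more lenient about \r\n-style newlines and stray spaces, as raised in the comments above, is to normalize the input before the section loop runs. This is only a minimal sketch; `normalizeLines` is a hypothetical helper name and is not part of the original answer.

```javascript
// Hypothetical helper: unify line endings, split into lines, and trim each line.
function normalizeLines(input) {
    return input
        .replace(/\r\n?/g, "\n")                 // turn \r\n and lone \r into \n
        .split("\n")
        .map(function (line) { return line.trim(); });
}

var normalized = normalizeLines("Booking: \r\n  2 people\r\n\r\nUser Details:\r\n");
// normalized is ["Booking:", "2 people", "", "User Details:", ""]
```

The resulting array could be used in place of `input.split("\n")`, so that a title such as "Booking: " with a trailing space still compares equal to "Booking:".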
This method uses regex. It's very flexible, especially if you're not sure how the input will be formatted, but it can get quite complicated. This version should be ok with extra spaces, missing data, differently formatted phone numbers, prices with commas and decimal points, empty lines...

var input = "Booking:\n2 people\n\nUser Details:\nFirstname Lastname\n+32 (0)9 123.456.789\n[email protected]\nfacebook.com/username\n\nExtras:\nService1\nService2\n\nPricing:\n$1500/-\n\nComments:\nLorem ipsum\n\ndolor sit amet";

var people, name, phone, email, fbook, extras, price, comments, temp;

// split input into 2 parts: data and comments (because the comments could contain any 
// text, including names of sections and other things which may complicate the regex).
var parts = input.match(/^((?:.|\n)*?)\n\s*\n\s*Comments\s*:\s*\n((?:.|\n)*)/i);

if (parts && parts.length > 1)
{
    temp = parts[1].match(/\s*Booking\s*:\s*\n\s*(\d+)\s*(?:person|people)/i);
    if (temp && temp.length == 2) people = temp[1];

    temp = parts[1].match(/\s*User\s*Details\s*:\s*\n\s*(.*?)\n/i);
    if (temp && temp.length == 2) name = temp[1];

    temp = parts[1].match(/\s*User\s*Details\s*:\s*\n(?:.*\n){0,1}\s*([\s\d./()+-]+?)\s*\n/i);
    if (temp && temp.length == 2) phone = temp[1];

    temp = parts[1].match(/\s*User\s*Details\s*:\s*\n(?:.*\n){0,2}\s*(.+?@.+?)\s*\n/i);
    if (temp && temp.length == 2) email = temp[1];

    temp = parts[1].match(/\s*User\s*Details\s*:\s*\n(?:.*\n){0,3}\s*(facebook\.com\/.+?)\s*\n/i);
    if (temp && temp.length == 2) fbook = temp[1];

    temp = parts[1].match(/\s*Extras\s*:\s*\n((?:.*\n?)*?)\n\s*Pricing\s*:\s*\n/i);
    // Strip trailing newlines first, then replace every remaining newline (note the g flag):
    if (temp && temp.length == 2) extras = temp[1].replace(/\n+$/, "").replace(/\n+/g, ", ");

    temp = parts[1].match(/\s*Pricing\s*:\s*\n\s*([$\d,.]+)/i);
    if (temp && temp.length == 2) price = temp[1];

    if (parts.length > 2) comments = parts[2];
}

alert("people: " + people + "\nname: " + name + "\nphone: " + phone + "\nemail: " + email + "\nfbook: " + fbook + "\nextras: " + extras + "\nprice: " + price + "\ncomments: " + comments);
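Note that the regex version leaves `price` as a string such as "$1500". If a number is needed, one possible approach is sketched below. `parsePrice` is a hypothetical helper, and it assumes that commas are thousands separators and the dot is the decimal point, which may not hold for every locale.

```javascript
// Hypothetical helper: turn a captured price string like "$1,500.50" into a number.
function parsePrice(text) {
    var cleaned = text.replace(/[^\d.]/g, "");   // drop "$", commas and anything else
    return cleaned === "" ? NaN : parseFloat(cleaned);
}

var plainPrice = parsePrice("$1500");        // 1500
var commaPrice = parsePrice("$1,500.50");    // 1500.5
var noPrice = parsePrice("");                // NaN, because there are no digits to parse
```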

5 Comments

Every time you see \s* that's just there in case there are extra spaces in the input.
This is a decent overview of regular expressions in javascript: developer.mozilla.org/en/docs/Web/JavaScript/Guide/…
Hi! Well thanks to your extensive explanation of the (previous) code, I have altered it to work with my updated string.. Now I understand 90% of it except for the initial for loop { for (var i in lines) } that breaks the sections into 2D-array.. and the other while loop where we put commas in between extras { if (extras.length) extras += ","; } cause the extras.length basically = total number of letters in extras.. so how.. Once again, I really appreciate your time and effort! Please also suggest if I should go for the regex solution or stick with the 1st one seeing what I'm comfortable with.
I'd use the regex version, because it's more flexible; look at how the booking line can be "2 people" or "1 person", and "User Details" can be "UserDetails" or "User Details", and the phone, email and facebook lines can be missing without causing problems. In the long run you'll be glad if you learn about regex. But if you're in a hurry to finish a project, stick to the first version for now.
There's one downside to regex: it can be slower than other JavaScript functions, so if you're running thousands of regexes, it may take a few seconds.
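To illustrate the points made in these comments about `\s*` and the flexible "person/people" match, here is a small sketch. The patterns are simplified stand-ins for the ones in the answer (the `^` and `$` anchors are added for the demonstration), and the sample strings are made up.

```javascript
// \s* allows zero or more whitespace characters; the i flag ignores case.
var titleRe = /^\s*User\s*Details\s*:\s*$/i;
var samples = ["User Details:", "UserDetails:", "  user details :  ", "User-Details:"];
var matches = samples.map(function (s) { return titleRe.test(s); });
// matches is [true, true, true, false]: the hyphen is not whitespace, so the last one fails

// (?:person|people) accepts either word without creating an extra capture group.
var bookingRe = /(\d+)\s*(?:person|people)/i;
var one = "1 person".match(bookingRe);   // one[1] is "1"
var two = "2 People".match(bookingRe);   // two[1] is "2"
```

The captured values are strings, so `parseInt(one[1], 10)` would be needed to get a number.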
