I have this string:

var htmlString;

Assigned to:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" >
<html>
<head>
<title>Payment Receipt</title>
<link rel="stylesheet" type="text/css" href="content/PaymentForm.css">
<style type="text/css">
</style>
<meta content='text/html; charset=UTF-8' http-equiv='Content-Type'/>
</head>
<body>
<div id="divPageOuter" class="PageOuter">
    <div id="divPage" class="Page">
        <!--[1]-->
        <div id="divThankYou">
             Thank you for your order!
        </div>
        <hr class="HrTop">
        <div id="divReceiptMsg">
             You may print this receipt page for your records.
        </div>
        <div class="SectionBar">
             Order Information
        </div>
        <table id="tablePaymentDetails1Rcpt">
        <tr>
            <td class="LabelColInfo1R">
                 Merchant:
            </td>
            <td class="DataColInfo1R">
                <!--Merchant.val-->
                Ryan
                <!--end-->
            </td>
        </tr>
        <tr>
            <td class="LabelColInfo1R">
                 Description:
            </td>
            <td class="DataColInfo1R">
                <!--x_description.val-->
                Rasmussenpayment
                <!--end-->
            </td>
        </tr>
        </table>
        <table id="tablePaymentDetails2Rcpt" cellspacing="0" cellpadding="0">
        <tr>
            <td id="tdPaymentDetails2Rcpt1">
                <table>
                <tr>
                    <td class="LabelColInfo1R">
                         Date/Time:
                    </td>
                    <td class="DataColInfo1R">
                        <!--Date/Time.val-->
                        09-Jul-2012 12:26:46 PM PT
                        <!--end-->
                    </td>
                </tr>
                <tr>
                    <td class="LabelColInfo1R">
                         Customer ID:
                    </td>
                    <td class="DataColInfo1R">
                        <!--x_cust_id.val-->
                        <!--end-->
                    </td>
                </tr>
                </table>
            </td>
            <td id="tdPaymentDetails2Rcpt2">
                <table>
                <tr>
                    <td class="LabelColInfo1R">
                         Invoice Number:
                    </td>
                    <td class="DataColInfo1R">
                        <!--x_invoice_num.val-->
                        176966244
                        <!--end-->
                    </td>
                </tr>
                </table>
            </td>
        </tr>
        </table>
        <hr id="hrBillingShippingBefore">
        <table id="tableBillingShipping">
        <tr>
            <td id="tdBillingInformation">
                <div class="Label">
                     Billing Information
                </div>
                <div id="divBillingInformation">
                     Test14 Rasmussen<br>
                    1234 test st<br>
                    San Diego, CA 92107 <br>
                </div>
            </td>
            <td id="tdShippingInformation">
                <div class="Label">
                     Shipping Information
                </div>
                <div id="divShippingInformation">
                </div>
            </td>
        </tr>
        </table>
        <hr id="hrBillingShippingAfter">
        <div id="divOrderDetailsBottomR">
            <table id="tableOrderDetailsBottom">
            <tr>
                <td class="LabelColTotal">
                     Total:
                </td>
                <td class="DescrColTotal">
                     &nbsp;
                </td>
                <td class="DataColTotal">
                    <!--x_amount.val-->
                    US&nbsp;$250.00
                    <!--end-->
                </td>
            </tr>
            </table>
            <!-- tableOrderDetailsBottom -->
        </div>
        <div id="divOrderDetailsBottomSpacerR">
        </div>
        <div class="SectionBar">
             Visa ****0027
        </div>
        <table class="PaymentSectionTable" cellspacing="0" cellpadding="0">
        <tr>
            <td class="PaymentSection1">
                <table>
                <tr>
                    <td class="LabelColInfo2R">
                         Date/Time:
                    </td>
                    <td class="DataColInfo2R">
                        <!--Date/Time.1.val-->
                        09-Jul-2012 12:26:46 PM PT
                        <!--end-->
                    </td>
                </tr>
                <tr>
                    <td class="LabelColInfo2R">
                         Transaction ID:
                    </td>
                    <td class="DataColInfo2R">
                        <!--Transaction ID.1.val-->
                        2173493354
                        <!--end-->
                    </td>
                </tr>
                <tr>
                    <td class="LabelColInfo2R">
                        Authorization Code:
                    </td>
                    <td class="DataColInfo2R">
                        <!--x_auth_code.1.val-->
                        07I3DH
                        <!--end-->
                    </td>
                </tr>
                <tr>
                    <td class="LabelColInfo2R">
                         Payment Method:
                    </td>
                    <td class="DataColInfo2R">
                        <!--x_method.1.val-->
                        Visa ****0027
                        <!--end-->
                    </td>
                </tr>
                </table>
            </td>
            <td class="PaymentSection2">
                <table>
                </table>
            </td>
        </tr>
        </table>
        <div class="PaymentSectionSpacer">
        </div>
    </div>
    <!-- entire BODY -->
</div>
<div class="PageAfter">
</div>
</body>
</html>

I want to find the location of "x_auth_code.1.val" in the string, and then obtain a substring that starts at that location and runs for a certain number of characters. The goal is to return the authorization code.

2 Answers


You can use indexOfSlice to find the position, and then slice() from StringOps to extract the substring:

scala> val myString = "Hello World!"
myString: java.lang.String = Hello World!

scala> val index = myString.indexOfSlice("Wo")
index: Int = 6

scala> val slice = myString.slice(index, index+5)
slice: String = World

With your HTML string:

scala> htmlString.indexOfSlice("x_auth_code.1.val")
res4: Int = 2771
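
The index alone does not yet give the authorization code. A minimal sketch of the remaining step is below. It assumes the value always sits between the `<!--x_auth_code.1.val-->` comment and the next `<!--end-->` comment, as in the receipt in the question. The names `AuthCodeSlice`, `valueAfterMarker`, and `sample` are made up for illustration, and `sample` is only a shortened stand-in for the full HTML string.

```scala
object AuthCodeSlice {
  // Hypothetical helper: returns the trimmed text between "<!--<field>-->"
  // and the next "<!--end-->", or None if either marker is missing.
  def valueAfterMarker(html: String, field: String): Option[String] = {
    val marker = s"<!--$field-->"
    val start = html.indexOfSlice(marker)
    if (start < 0) None
    else {
      val from = start + marker.length
      // Search for the closing marker only after the opening one.
      val end = html.indexOfSlice("<!--end-->", from)
      if (end < 0) None
      else Some(html.slice(from, end).trim)
    }
  }

  // Shortened stand-in for the receipt HTML from the question.
  val sample: String =
    """<td class="DataColInfo2R">
      |    <!--x_auth_code.1.val-->
      |    07I3DH
      |    <!--end-->
      |</td>""".stripMargin
}
```

Slicing up to the closing marker avoids guessing a fixed character count, which would break if the indentation or the code length changed. A field with no value, such as x_cust_id.val in the receipt, comes back as an empty string rather than None.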

1 Comment

indexOfSlice is a good addition to one's toolkit because of its worst-case Big-O runtime complexity.

Why aren't you using an XML parser? Don't treat XML as strings -- you'll get bitten if you do.

Here's a regex to do it, but my advice is: DO NOT USE IT! Use xml tools.

"""\Qx_auth_code.1.val\E[^>]*>([^<]*)""".r.findFirstMatchIn(htmlString).map(_.group(1).trim)
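
For reference, here is a sketch of the same regex approach wrapped in a reusable function. The capture group takes everything up to the next `<`, which includes the line breaks and indentation around the value, so the result is trimmed. The names `AuthCodeRegex` and `fieldValue` are hypothetical, and `Regex.quote` stands in for the literal `\Q...\E` quoting used above. The same caveat applies: this depends on the exact comment layout of the receipt and is not a substitute for a parser.

```scala
import scala.util.matching.Regex

object AuthCodeRegex {
  // Hypothetical helper: finds "<field>" inside a comment, skips to the
  // closing '>' of that comment, and captures the text up to the next '<'.
  def fieldValue(html: String, field: String): Option[String] = {
    val pattern = (Regex.quote(field) + """[^>]*>([^<]*)""").r
    pattern.findFirstMatchIn(html).map(_.group(1).trim)
  }
}
```

Calling `AuthCodeRegex.fieldValue(htmlString, "x_auth_code.1.val")` returns an `Option[String]`, so a missing field shows up as `None` instead of an exception.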

5 Comments

He's not using an XML parser, because he isn't dealing with XML. It's HTML 4.0 transitional. Of course there are parsers for that, but it might still be malformed. So if you want to recommend a parser, maybe something like jsoup would be good. But if he really just needs to extract one String, that is overkill.
@KimStebel There are plenty of tools that handle this, for example TagSoup and JTidy.
Yes, and all of them are probably overkill.
@KimStebel TagSoup is a lightweight SAX parser, so it's less overkill than the standard SAX parser.
By overkill I meant it's one more dependency and probably doesn't simplify his code.
