
I am trying to extract a string from a text file that I downloaded using EWS, and I am using PowerShell to do it. A snippet of the file follows.

<table class="button" style="border-collapse: collapse; border-spacing: 0; overflow: 
hidden; padding: 0; text-align: left; vertical-align: top; width: 100%;"><tbody>
<tr style="padding: 0; text-align: left; vertical-align: top;"><td style="-moz-hyphens: none; 
-webkit-hyphens: none; -webkit-text-size-adjust: none; background: #049FD9; 
border: none; border-collapse: collapse !important; border-radius: 2px; color: #fff; display: block; font-family: 'Helvetica-Light','Arial',sans-serif; font-size: 14px; font-weight: lighter; hyphens: none; line-height:19px; margin: 0; padding: 8px 16px; text-align: center; vertical-align: top; width: auto 
!important; word-break: keep-all;">
<a href="https://www.website.com:443/idb/setPassword?t=BcHJEoIgAADQD%2BKQjqZ4VEKtBHLJJm82uWDuxCR%2Bfe%2B58Rl9HRz6QddWkO5MLDXuF6e9m%2Bo0z%2FCVS%2B9IenAp5m5yTfYRa%2BAn4jdWHHF7HTyqRZiRRiNDEE%2BK7ZJywLKeNCTj4ewu4QNu02qXB0ZTXTyxXADwaLeluZGVPCxGXunpVcHbiCVAWRR7ykqGensLVBsqNUpl%2FQE%3D" 
style="-webkit-text-size-adjust: none; font-weight: 100; color: #fff; font-family: 'Helvetica-Light','Arial',sans-serif; font-size: 20px; font-weight: lighter; line-height: 32px; text-decoration: none;">Get Started</a> </td></tr></tbody></table></td>

I want to extract this part BcHJEoIgAADQD%2BKQjqZ4VEKtBHLJJm82uWDuxCR%2Bfe%2B58Rl9HRz6QddWkO5MLDXuF6e9m%2Bo0z%2FCVS%2B9IenAp5m5yTfYRa%2BAn4jdWHHF7HTyqRZiRRiNDEE%2BK7ZJywLKeNCTj4ewu4QNu02qXB0ZTXTyxXADwaLeluZGVPCxGXunpVcHbiCVAWRR7ykqGensLVBsqNUpl%2FQE%3D

I've tried -match with regex lookbehinds and lookaheads, but nothing seems to grab only that part.

I thought something like this might work:

$a = Get-Content $path 
$a -match '(?<=setPassword\?t\=)(.+)(?=" style)' 

$matches    

But it comes up blank.

2
  • 'SomePrefixAnyString'.Substring('SomePrefix'.Length) should result in AnyString. Commented May 1, 2017 at 17:01
  • Thanks, the problem is I don't know what AnyString is. It's always something different, which is why I need a starting point and an ending point and to grab anything in between. I'm thinking it's got to be some sort of regex... maybe. Commented May 1, 2017 at 17:25

2 Answers

4

Best not to use string manipulation for this; use existing libraries and classes.

So first of all, treat your URI as a [uri]:

$uri = [System.Uri]'https://www.website.com/idb/setPassword?t=BcHJEoIgAADQD%2BKQjqZ4VEKtBHLJJm82uWDuxCR%2Bfe%2B58Rl9HRz6QddWkO5MLDXuF6e9m%2Bo0z%2FCVS%2B9IenAp5m5yTfYRa%2BAn4jdWHHF7HTyqRZiRRiNDEE%2BK7ZJywLKeNCTj4ewu4QNu02qXB0ZTXTyxXADwaLeluZGVPCxGXunpVcHbiCVAWRR7ykqGensLVBsqNUpl%2FQE%3D'

Now you can get the query string like this:

$query = $uri.Query

That will start with ?t=, so let's parse it:

Add-Type -AssemblyName System.Web   # Windows PowerShell does not load System.Web by default
$queryData = [System.Web.HttpUtility]::ParseQueryString($query)

The resulting object is a collection keyed by parameter name. Since the parameter you want is called t, you can get its value like this (note that the value comes back URL-decoded, so %2B becomes +):

$queryData['t']
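
Putting the steps above together, a minimal sketch might look like the following. It assumes the link is already in a string; the short URL and the names $link, $token and $encoded are made up for illustration. Because ParseQueryString decodes the value, the last line re-encodes it in case the original percent-encoded form is what you need.

```powershell
# Windows PowerShell does not load System.Web by default.
Add-Type -AssemblyName System.Web

# Hypothetical short link; real tokens are much longer.
$link = 'https://www.website.com:443/idb/setPassword?t=abc%2Bdef%2Fghi%3D'

$uri = [System.Uri]$link

# .Query is only available on absolute URIs, so check before using it.
if ($uri.IsAbsoluteUri) {
    $queryData = [System.Web.HttpUtility]::ParseQueryString($uri.Query)

    # URL-decoded value of the t parameter.
    $token = $queryData['t']

    # Re-encode if the percent-encoded form is needed.
    $encoded = [System.Uri]::EscapeDataString($token)
}
```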

5 Comments

Thanks for the answer. I'm not sure this will work, because I have to know the URL to start with, and each link that comes in is different every time. The only constants in the link are website.com/idb/setPassword?t= and the closing quote ("). That's why I need to search for those two points of reference and extract the long string in between.
@JasonMurray why would that prevent this from working? This method specifically doesn't depend on the URL; the only assumption is that the query parameter you want has a key of t. Just take the string that contains the URL and cast it as [System.Uri] as shown, or use the -as operator. The example only hardcodes the string because that's what you supplied.
Ah, I see what you mean; it does work. The issue is that I need to pull the full URL out of HTML/text data that I pulled from Exchange, which is the body of the email. All that data is assigned to a variable that I should be able to read from. So I need some way to pull the URL from the data and then parse it as you show to get the value after t=.
@JasonMurray well.. that's outside the scope of your question, but I would recommend parsing the HTML with HtmlAgilityPack.
Updated the question to reflect more of what I needed an answer on. Thanks.
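
For the case discussed in these comments, where the HTML body is already in a variable, one possible sketch is to pull the href out first and then parse it as a URI. This uses a plain regex rather than HtmlAgilityPack, so it is only a rough stand-in for real HTML parsing. The $body content is a made-up, shortened version of the email, and the group name url is arbitrary.

```powershell
Add-Type -AssemblyName System.Web   # needed in Windows PowerShell

# Hypothetical stand-in for the email body; the real token is much longer.
$body = @'
<td style="color: #fff;">
<a href="https://www.website.com:443/idb/setPassword?t=abc%2Bdef%3D"
style="font-size: 20px;">Get Started</a></td>
'@

# Capture the href value of the first link that points at setPassword.
# [^"]* cannot cross a double quote, so the match stays inside the attribute.
$m = [regex]::Match($body, 'href="(?<url>[^"]*setPassword\?[^"]*)"')

if ($m.Success) {
    $uri   = [System.Uri]$m.Groups['url'].Value
    # URL-decoded value of the t parameter.
    $token = [System.Web.HttpUtility]::ParseQueryString($uri.Query)['t']
}
```

If the link ever carries more than one parameter, the href will probably contain &amp; entities, which would need decoding before the URI is parsed.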
0

So I figured out that the text file I was writing to was wrapping part of the long URL. I ended up writing it with the -Width parameter so that it wouldn't do that.

$email | Out-File $path -Width 9999999

Then I used -match to grab the string I needed:

$a = Get-Content $path | Where-Object {$_ -match '(?<=https:\/\/www\.website\.com:443\/idb\/setPassword\?t=)(.+)(?=" style)'}
$matches[1]

Hope that helps someone.
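
If rewriting the file with a larger width is not an option, another approach that might work is to read the file as a single string with -Raw (PowerShell 3.0 and later) and strip the line breaks before matching. This also sidesteps a quirk of the original attempt: Get-Content returns an array of lines, and -match against an array filters the lines rather than filling $Matches. The temp file and its shortened content below are hypothetical.

```powershell
# Hypothetical sample file whose link was wrapped in the middle of the token.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'email-sample.txt'
@'
<a href="https://www.website.com:443/idb/setPassword?t=abc%2B
def%3D" style="font-size: 20px;">Get Started</a>
'@ | Set-Content -Path $path

# -Raw returns the file as one string; removing the line breaks makes
# a token that was wrapped mid-value contiguous again.
$text = (Get-Content -Path $path -Raw) -replace '\r?\n', ''

# With a single string on the left, -match returns a Boolean and fills $Matches.
# [^"]+ stops at the closing quote, so it cannot run past the end of the href.
if ($text -match '(?<=setPassword\?t=)[^"]+') {
    $token = $Matches[0]
}

# Clean up the sample file.
Remove-Item -Path $path
```

Unlike the ParseQueryString route, this keeps the token percent-encoded exactly as it appears in the link.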
