0

I have a string of attachments like this:

"<a href="/departments/Attachments/2043_3162016062557_SharePoint_Health%20Check%20Assessment.docx">SharePoint_Health Check Assessment.docx</a><br><a href="/departments/Attachments/2043_3162016062557_Test%20Workflow.docx">Test Workflow.docx</a><br>"

I used this call to strip the tags:

AttachmentName = System.Text.RegularExpressions.Regex.Replace(AttachmentName, @"<(.|\n)*?>", string.Empty);

and got this result:

SharePoint_Health Check Assessment.docxTest Workflow.docx

How can I split the string in C# and get each file name separately, like this:

SharePoint_Health Check Assessment.docx

Test Workflow.docx

and then show them in a control one by one?

After that, I want just the URLs from the string, like "http://srumos1/departments/Attachments/2053_3172016093545_ITPCTemplate.txt" and "http://srumos1/departments/Attachments/2053_3172016093545_ITPCTemplate.txt".

How can I do that?

15
  • 3
    With no delimiter between the filenames and no consistent length of the extensions, you're going to have a difficult time accomplishing this. How are you getting the original string, and is it possible to put a delimiter (like a comma, or something else that's not acceptable in a file name) between the file names? Commented Mar 16, 2016 at 15:11
  • 3
    If you have a list of expected extensions, it's doable; otherwise, as CoderHxr implies, it's a quagmire. If it's possible to get the values delimited, you can use Split() and then assign them to a control using DataSource. Commented Mar 16, 2016 at 15:14
  • 1
    I'm not a RegEx-kenner, but maybe replacing "string.Empty" with ";" or some such would give you the delimitation that would make this a lot easier. Commented Mar 16, 2016 at 15:16
  • 1
    Do all your file names start with test? Commented Mar 16, 2016 at 15:20
  • 1
    Can you also add how the data looks before you ran AttachmentName = System.Text.RegularExpressions.Regex.Replace(AttachmentName, @"<(.|\n)*?>", string.Empty); It would be useful to see the raw data before you took out the delimiters. Commented Mar 16, 2016 at 15:38

2 Answers

2

i got it this way AttachmentName = Regex.Replace(AttachmentName, @"<(.|\n)*?>", string.Empty);

Well, there's your problem. You had valid delimiters (the HTML tags) but stripped them away. Leave the delimiters in place and use String.Split to split on them.

Or replace the HTML with a delimiter instead of an empty string:

AttachmentName = Regex.Replace(AttachmentName, @"<(.|\n)*?>", "|");

And then split based off of that:

string[] filenames = AttachmentName.Split(new [] {'|'},
                                          StringSplitOptions.RemoveEmptyEntries);
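
To see the two steps working together, here is a minimal sketch run against the HTML string from the question. The class and method names (AttachmentSplitter, SplitFileNames) are made up for illustration, and it assumes '|' never occurs in a file name, which holds on Windows because '|' is not a legal file name character there.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AttachmentSplitter
{
    // Hypothetical helper: replaces every HTML tag with '|' and splits on it.
    // Adjacent tags such as </a><br><a ...> produce empty entries, which
    // RemoveEmptyEntries drops.
    public static string[] SplitFileNames(string html)
    {
        string delimited = Regex.Replace(html, @"<(.|\n)*?>", "|");
        return delimited.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        // The attachment string from the question.
        string attachmentHtml =
            "<a href=\"/departments/Attachments/2043_3162016062557_SharePoint_Health%20Check%20Assessment.docx\">SharePoint_Health Check Assessment.docx</a><br>" +
            "<a href=\"/departments/Attachments/2043_3162016062557_Test%20Workflow.docx\">Test Workflow.docx</a><br>";

        // Print each file name on its own line.
        foreach (string name in SplitFileNames(attachmentHtml))
        {
            Console.WriteLine(name);
        }
    }
}
```

For the question's input this yields two entries, "SharePoint_Health Check Assessment.docx" and "Test Workflow.docx". The resulting array can then be bound to a list control or looped over to fill one.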

6 Comments

@Revenant_01 I don't know - what got stripped from the HTML?
This was the original string: <a href="/departments/ITPC/Attachments/2043_3162016062557_SharePoint_Health%20Check%20Assessment.docx">SharePoint_Health Check Assessment.docx</a><br><a href="/departments/ITPC/Attachments/2043_3162016062557_Test%20Workflow.docx">Test Workflow.docx</a><br> Quite beautiful, huh.
And after removing the tags: SharePoint_Health Check Assessment.docxTest Workflow.docx
You could also replace the HTML with "|" instead of an empty string and use that as a delimiter.
I used ',' and the result is this: ,SharePoint_Health Check Assessment.docx,,,Test Workflow.docx,,
2

You can use a regex to extract the file names if you have no other clear way to do it. Can you try the code below?

using System;
using System.Collections.Generic;
using System.Text;
using System.Linq;
using System.Text.RegularExpressions;

namespace ExtensionExtractingTest
{
    class Program
    {
        static void Main(string[] args)
        {
            string fileNames = "test.docxtest2.txttest3.pdftest.test.xlxtest.docxtest2.txttest3.pdftest.test.xlxtest.docxtest2.txttest3.pdftest.test.xlxourtest.txtnewtest.pdfstackoverflow.pdf";

            // Add your own extensions to the alternation in the regex below.
            Regex fileNameMatchRegex = new Regex(@"[a-zA-Z0-9]*(\.txt|\.pdf|\.docx|\.xlx)", RegexOptions.IgnoreCase);
            MatchCollection matchResult = fileNameMatchRegex.Matches(fileNames);
            List<string> fileNamesList = new List<string>();
            foreach (Match item in matchResult)
            {
                fileNamesList.Add(item.Value);
            }
            fileNamesList = fileNamesList.Distinct().ToList();

            Console.WriteLine(string.Join(";", fileNamesList));
        }
    }
}

And a working example is here http://ideone.com/gbopSe

PS: Keep in mind that you have to know your file name extensions in advance. Otherwise you have to guess that each extension is 3 or 4 characters long, and that makes for painful string parsing.
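
If the original HTML is still available, guessing extensions may not be needed at all: a regex over the anchor tags can capture both the href and the link text, which also covers the URL part of the question. The following is only a sketch under assumptions. The names AttachmentLinks and Extract are invented for illustration, the base address http://srumos1 is taken from the URLs in the question, and the pattern expects the simple <a href="...">text</a> shape shown there, not arbitrary HTML.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AttachmentLinks
{
    // Captures the href value and the link text of each <a href="...">text</a>.
    private static readonly Regex AnchorRegex = new Regex(
        @"<a\s+href=""(?<url>[^""]*)""[^>]*>(?<name>[^<]*)</a>",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: returns (file name, absolute URL) pairs in document order.
    public static List<KeyValuePair<string, string>> Extract(string html, Uri baseUri)
    {
        var links = new List<KeyValuePair<string, string>>();
        foreach (Match m in AnchorRegex.Matches(html))
        {
            // Resolve the site-relative href against the assumed base address.
            var absolute = new Uri(baseUri, m.Groups["url"].Value);
            links.Add(new KeyValuePair<string, string>(
                m.Groups["name"].Value, absolute.AbsoluteUri));
        }
        return links;
    }

    public static void Main()
    {
        string html =
            "<a href=\"/departments/Attachments/2043_3162016062557_SharePoint_Health%20Check%20Assessment.docx\">SharePoint_Health Check Assessment.docx</a><br>" +
            "<a href=\"/departments/Attachments/2043_3162016062557_Test%20Workflow.docx\">Test Workflow.docx</a><br>";

        // Print each file name next to its absolute URL.
        foreach (var link in Extract(html, new Uri("http://srumos1")))
        {
            Console.WriteLine(link.Key + " -> " + link.Value);
        }
    }
}
```

Because the names come from the link text, this handles spaces and underscores in file names without listing extensions. For HTML that is less regular than the sample, a real HTML parser would be a safer choice than a regex.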

Hope this helps
