3

I have a generic C# list List<Result> Results which has several fields with several rows of data in it as follows of type RESULT

public class Result
{
    public int TrdID;
    public string Term;
    public double Price;
    public string Seller;
    public string Buyer;
    public bigint Quantity;
}

and the data is as follows

TrdID    Term      Price  Seller Buyer  Quantity
1000     201508      1     ABC    XYZ   10
1010     201508      2      ACB   PQR   10
1002     201507      1.5   ABX    PEW   20 
1002     201506      1.5   ABX    PEW   20
1002     201508      1.5   ABX    PEW   20

Now, I want to process the above data and return a single for a single unique ID provided for a particular ID all columns will be same except Term. For example I have three records for ID 1002, I want to merge them to a single record by converting Term to a DateTime and introducing two new columns saying startDate and EndDate and the expected results are as follows

Expected Result

   TrdID     StartDate     EndDate      Price  Seller Buyer  Quantity
    1000     08/01/2015    08/31/2015    1      ABC    XYZ   10
    1010     08/01/2015    08/31/2015    2      ACB    PQR   10
    1002     06/01/2015    08/31/2015    1.5    ABX    PEW   60 

From the above data, one can see that I have two new columns named StartDate and EndDate instead of Term Column which is used to get the values for both StartDate and EndDate. Also, there is a change to Quantity in the last row which is sum of all Quantities for a particular ID ie., No.Of times ID repeated*Quantity

Here, I have two cases to solve

Case1: If there is only one record for particular ID say, 1000 then StartDate will be the first day of the month in Term column and EndDate will be the last day of the month in Term ie 08/01/2015 and 08/31/2015. No change to Quantity

Case2: If there are multiple records for a particular ID say 1002(all terms will be in a sequence) then StartDate will be first day of least month ie., 06/01/2015 and EndDate will be last day of the highest month of Term column ie., 08/31/2015. Quantity=No.Of times ID repeated*Quantity ie., 20*3=60

My resultant list will of type

public class ProcessedResult
{
    public int TrdID;
    public DateTime StartDate;
    public DateTime EndDate;
    public double Price;
    public string Seller;
    public string Buyer;
    public bigint Quantity;
}

May I know a good way to solve this?

2
  • 2
    What have you tried? Where are you stuck? GroupBy, Min, and Max seem like reasonable functions to use. Commented Jul 23, 2015 at 18:38
  • 1
    When there are multiple records with the same TrdID, will the values of Price, Seller, Buyer and Quantity be the same for all of the records? Commented Jul 23, 2015 at 18:41

1 Answer 1

3

Provided that the seller and the buyer would be the same as in the example you presented in your post, you need something like this:

var result = list.GroupBy( result => result.TrdID)
                 .Select( gr => {
                 var startDate = DateTime.ParseExact(gr.OrderBy(x => x.Term).First().Term, "yyyyMM", CultureInfo.InvariantCulture);
                 var endDate = DateTime.ParseExact(gr.OrderByDescending(x => x.Term).First().Term, "yyyyMM", CultureInfo.InvariantCulture);                      
                 return  new ProcessedResult
                 {
                     TrdID = gr.Key
                     StartDate = new DateTime(startDate.Year, startDate.Month, 1),
                     EndDate = (new DateTime(endDate.Year, endDate.Month, 1)).AddMonth(1).AddDays(-1),
                     Price = gr.First().Price,
                     Seller = gr.First().Seller,
                     Buyer = gr.First().Buyer,
                     Quantity = gr.Sum(x=>x.Quantity)
                 } 
             });

Essentially, we make a group by of the results (list contains your results). The trick is done on how we construct the StartDate and EndDate. There we make an order by of the rows in the same group and we get the first one in order to get the StartDate's string representation. Then we use the DateTime.ParseExact method, because we know the exact format of the DateTime. As far as the EndDate we make an OrderByDescending, in order we get the last date - we could make an OrderBy and get the last element, it's equivalent. Then we apply the DateTime.ParseExact again.

Sign up to request clarification or add additional context in comments.

9 Comments

Correcton: An unhandled exception of type 'System.FormatException' occurred in mscorlib.dll Additional information: String was not recognized as a valid DateTime.This is because the Term is an string in my list
And I dont think this gives me StartDate as the first day of that month and EndDate as last day of the month that is in Term string field
I have made a minor change to your answer to remove the exception in the first comment by DateTime.ParseExact(gr.OrderBy(x=>x.Term).First().Term but this still doesn't give me the correct EndDate as 08/31/2015 instead it give 08/01/2015
@Dev you are correct. My mistake :(. Please have a look know.
startDate and endDate again will just be of form yyyyMM and how come you use SubString(4,6) for just '201508`
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.