0

I'm trying to match this recurring pattern in a JSON file:

{ 
    "date":1568381400,
    "open":301.7799987792969,
    "high":302.1700134277344,
    "low":300.67999267578125,
    "close":301.0899963378906,
    "volume":61426700,
    "adjclose":301.0899963378906
}

Note: The above is the formatted version. The actual JSON is all on one line (optional whitespace removed).

There are a bunch of them separated by commas, no spaces. I use the following code:

while ( $Page =~ /{"date":(.+?),.+?"high":(.+?),"low":(.+?),"close":(.+?),"volume":(.+?),.+?"adjclose":(.+?)}/g )

Each iteration of the loop matches TWO instances of the pattern instead of one. For example, $& contains:

at84: matched:{"date":1623182400,"open":91.48999786376953,"high":92.37999725341797,"low":90.77999877929688,"close":92,"volume":15404,"adjclose":92},{"date":1623072600,"open":89.80999755859375,"high":91.3499984741211,"low":89.80999755859375,"close":90.75,"volume":36200,"adjclose":90.75}

It never matches more or fewer than exactly two instances of the pattern.

I've tried adding a '?' at the end of the pattern, which does nothing.

I suppose I could change the loop to index the commas or the {} block, but that would add a layer of kludge I'd like to avoid.

Has anyone any suggestions?

1
  • Are you missing a quote char before date in your sample input? Any other errors? Commented Jun 9, 2021 at 2:39

2 Answers

3

Your regex forces a requirement that doesn't exist:

"date":(.+?),.+?"high":(.+?),"low":(.+?),"close":(.+?),"volume":(.+?),.+?"adjclose":(.+?)
                                                                       ↑
                                                                       │
                       "+" requires characters but there are none ─────┘

Your input has no characters between the comma after "volume" and "adjclose", but .+? must match at least one character. To succeed, the regex therefore has to consume input all the way to the "adjclose" of the next object, which is why each match spans two objects.
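To see the effect in isolation, here is a minimal sketch on a made-up string (the a:/b: keys are hypothetical, not from the question). It only illustrates that a lazy .+? still needs one character, while .*? can match nothing:

```perl
use strict;
use warnings;

# Hypothetical toy input: two records separated by ';'.
my $text = 'a:1,b:2;a:3,b:4';

# ",.+?" must eat at least one character after the comma, so the first
# "b:" is swallowed and the match runs on to the second record.
my ($plus) = $text =~ /(a:\d,.+?b:\d)/;

# ",.*?" may match the empty string, so the match stops at the first "b:".
my ($star) = $text =~ /(a:\d,.*?b:\d)/;

print "with .+? : $plus\n";   # with .+? : a:1,b:2;a:3,b:4
print "with .*? : $star\n";   # with .*? : a:1,b:2
```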

Change:

"volume":(.+?),.+?"adjclose":(.+?)

To:

"volume":(.+?),.*?"adjclose":(.+?)

I would change every (.+?) to (.*?).
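As a rough check, the loop from the question can be run with only the two ",.+?" gaps relaxed to ",.*?". This is a sketch, assuming $Page holds the two objects shown in the question's sample output; the @rows array and its hash keys are names chosen here for illustration:

```perl
use strict;
use warnings;

# Sample input copied from the question's output: two objects, no spaces.
my $Page = '{"date":1623182400,"open":91.48999786376953,"high":92.37999725341797,"low":90.77999877929688,"close":92,"volume":15404,"adjclose":92},{"date":1623072600,"open":89.80999755859375,"high":91.3499984741211,"low":89.80999755859375,"close":90.75,"volume":36200,"adjclose":90.75}';

my @rows;   # one hash per matched object (illustrative name)

# Same pattern as in the question, except that both ",.+?" gaps are now
# ",.*?" so they are allowed to match nothing. Braces are escaped for safety.
while ( $Page =~ /\{"date":(.+?),.*?"high":(.+?),"low":(.+?),"close":(.+?),"volume":(.+?),.*?"adjclose":(.+?)\}/g ) {
    push @rows, {
        date => $1, high   => $2, low      => $3,
        close => $4, volume => $5, adjclose => $6,
    };
}

printf "%d matches\n", scalar @rows;   # prints "2 matches"
print "$_->{date} adjclose=$_->{adjclose}\n" for @rows;
```

With this change each iteration should see exactly one object, so the loop body runs twice for the sample input.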


2 Comments

The edit to the question by a third party changes the structure of the JSON (now "There are a bunch of them separated by commas, no spaces." as stated by the OP is not true any more, neither is your statement "Your input has no characters between the comma after "volume" and "adjclose""). Should the edit be rolled back, or does it not invalidate your answer?
@WaiHaLee However, the sample output agrees with the original question. I've edited the question to make this clear.
0

The question refers to the JSON format, so the input could perhaps be processed with the JSON module. However, the question does not provide enough information about the input data to be sure.

Please check whether the following code snippet fits your problem.

NOTE: the posted question shows only a very limited subset of the input data.

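use strict;
use warnings;
use feature 'say';

use Data::Dumper;

my %data;
# Slurp the whole DATA section into one string.
my $input = do { local $/; <DATA> };

# Split the symbol prefix ("at84") from the comma-separated objects.
my($symbol,$block) = $input =~ /(.+?): matched:(.*)/;

# Visit the body of each {...} object in turn.
for ( $block =~ /\{(.*?)\}/g ) {
    # Collect the "key":number pairs into a hash.
    my %day = $_ =~ /"(.+?)":([\d.]+)/g;
    # Convert the epoch timestamp to a readable local-time string.
    $day{date} = localtime($day{date});
    push @{$data{$symbol}}, \%day;
}

say Dumper(\%data);

exit 0;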

__DATA__
at84: matched:{"date":1623182400,"open":91.48999786376953,"high":92.37999725341797,"low":90.77999877929688,"close":92,"volume":15404,"adjclose":92},{"date":1623072600,"open":89.80999755859375,"high":91.3499984741211,"low":89.80999755859375,"close":90.75,"volume":36200,"adjclose":90.75}

Output

$VAR1 = {
          'at84' => [
                      {
                        'low' => '90.77999877929688',
                        'volume' => '15404',
                        'date' => 'Tue Jun  8 16:00:00 2021',
                        'high' => '92.37999725341797',
                        'adjclose' => '92',
                        'open' => '91.48999786376953',
                        'close' => '92'
                      },
                      {
                        'adjclose' => '90.75',
                        'open' => '89.80999755859375',
                        'close' => '90.75',
                        'low' => '89.80999755859375',
                        'volume' => '36200',
                        'date' => 'Mon Jun  7 09:30:00 2021',
                        'high' => '91.3499984741211'
                      }
                    ]
        };
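If the data really is valid JSON, a JSON parser avoids the regex altogether. The following is only a sketch under assumptions: it uses JSON::PP (in core since Perl 5.14) and assumes the comma-separated objects can be wrapped in [ ] to form a JSON array. Whether that holds for the full file is not known from the question; $objects and $records are names chosen for illustration:

```perl
use strict;
use warnings;
use JSON::PP;   # core module since Perl 5.14; exports decode_json

# The two objects from the question's sample output.
my $objects = '{"date":1623182400,"open":91.48999786376953,"high":92.37999725341797,"low":90.77999877929688,"close":92,"volume":15404,"adjclose":92},{"date":1623072600,"open":89.80999755859375,"high":91.3499984741211,"low":89.80999755859375,"close":90.75,"volume":36200,"adjclose":90.75}';

# Assumption: wrapping the comma-separated objects in [ ] gives a JSON array.
my $records = decode_json( '[' . $objects . ']' );

# Each element is now a plain hash reference keyed by field name.
for my $r (@$records) {
    printf "%s close=%s volume=%s\n",
        scalar gmtime( $r->{date} ), $r->{close}, $r->{volume};
}
```

The parser does not care about field order or optional whitespace, so the formatted and the one-line versions of the data would be handled the same way.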
