
I hope you are all doing well. I need a simple solution to my problem. I want to write Python code that takes C code as a string. A regex like r"#([^}]+)>|#([^}]+)\.h" should then detect the header-include part (from #include up to '>' or '.h'). I also want to extract both the group 1 and group 2 parts of the regex. The Python code should then extract that header part, remove it from the C code, and save it in one variable. The remaining code should be saved in another variable.

For example, str1 =

#include <iostream>
#include <string>
#include conio.h

and str2=

void main (void)
{
int b = 32;
int a=34;
int wao= 35;
}

The input source code is:

#include <iostream>
#include <string>
#include conio.h

void main (void)
{
int b = 32;
int a=34;
int wao= 35;
}
  • Don't parse a language like C with a regex. It's too complex for a regex to handle. Commented Jan 26, 2021 at 22:11
  • I know it's complex to handle but I really needed that for my project. Commented Jan 26, 2021 at 22:11
  • If you know any other method, then kindly let me know so that I can transform it. Commented Jan 26, 2021 at 22:13

1 Answer

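First of all, every #include in C sits on a single line (check this), so you can use #(.*) instead (I've found it works better for this) and validate the matches later. Keep in mind that #(.*) matches everything from a # to the end of the line, so other directives such as #define are captured too, which is why the validation step matters. If you use the regex from your post, you will run into trouble in the next step (try it yourself).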
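To make the "try it yourself" remark concrete, here is a minimal sketch that runs both patterns on the sample input from the question. The variable names `original` and `per_line` are illustrative only and do not come from the post.

```python
import re

src_code = """#include <iostream>
#include <string>
#include conio.h

void main (void)
{
int b = 32;
int a=34;
int wao= 35;
}
"""

# Pattern from the question: [^}]+ also matches newlines,
# so a single match can swallow several lines at once.
original = [m.group() for m in re.finditer(r"#([^}]+)>|#([^}]+)\.h", src_code)]

# Pattern from the answer: "." does not match a newline by default,
# so every match stays on one line.
per_line = [m.group() for m in re.finditer(r"#(.*)", src_code)]

print(original)
print(per_line)
```

On this input the question's pattern merges the first two include lines into one match, while the per-line pattern yields one match per directive.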

If src_code is your input source code, then

import re

headers = list(re.finditer(r"#(.*)", src_code)) # Extract all the header matches

gives, as a result,

[<re.Match object; span=(1, 20), match='#include <iostream>'>,
 <re.Match object; span=(21, 38), match='#include <string>'>,
 <re.Match object; span=(39, 55), match='#include conio.h'>]

You can now get all the header strings in a list (which you can then check for validity if you want):

headers = [i.group() for i in headers] # get the match from above
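The question also asks for the captured part of each header (its group 1 and group 2). One possible way to do the "validate later" step and pull out the header name is sketched below. The pattern `INCLUDE_RE` and the helper `header_name` are hypothetical names, and the third alternative (a bare name such as `conio.h`) is accepted only because the question's sample uses it; it is not valid C.

```python
import re

headers = ['#include <iostream>', '#include <string>', '#include conio.h']

# Group 1: <name>, group 2: "name", group 3: bare name (assumption, see above).
INCLUDE_RE = re.compile(r'#\s*include\b\s*(?:<([^>]+)>|"([^"]+)"|(\S+))')

def header_name(line):
    """Return the included file name, or None if the line is not an #include."""
    m = INCLUDE_RE.match(line.strip())
    if m is None:
        return None
    # Exactly one of the three groups is set when the pattern matches.
    return next(g for g in m.groups() if g is not None)

names = [header_name(h) for h in headers]
print(names)
```

A line such as `#define N 3` makes `header_name` return None, so it can be filtered out of the list.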

And you can remove all the #include lines from the source code using re.sub:

src_code = re.sub(r"#(.*)(\n+)", "", src_code) # Also remove any `\n` coming after

And there you have it

  • src_code
void main (void)
{
int b = 32;
int a=34;
int wao= 35;
}
  • For the headers, you can use "\n".join(headers) to get a single string, but note that #include directives may also be declared inside scopes, i.e. within braces {} (functions, structs, bare blocks, etc.)
['#include <iostream>', '#include <string>', '#include conio.h']
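Putting the steps together, the following is a minimal sketch of a helper that returns the two strings the question calls str1 and str2. The name `split_includes` and the pattern `INCLUDE_LINE` are hypothetical. Unlike `#(.*)`, this pattern only targets lines that start with `#include`, so other directives stay in the body. It does not understand comments or string literals, so an `#include` at the start of a line inside a block comment would still be removed.

```python
import re

# A line that starts (after optional indentation) with "#include";
# the trailing newline is consumed so no blank line is left behind.
INCLUDE_LINE = re.compile(r"^[ \t]*#[ \t]*include\b[^\n]*\n?", re.MULTILINE)

def split_includes(src_code):
    """Return (headers, body): the include lines and the remaining code."""
    headers = [m.group().strip() for m in INCLUDE_LINE.finditer(src_code)]
    # Drop the include lines, then the blank lines they leave at the top.
    body = INCLUDE_LINE.sub("", src_code).lstrip("\n")
    return "\n".join(headers), body

sample = """#include <iostream>
#include <string>
#include conio.h

#define N 3
void main (void)
{
int b = 32;
}
"""

str1, str2 = split_includes(sample)
print(str1)
print(str2)
```

Here `str1` holds the three include lines joined by newlines, and `str2` starts at the `#define` line, which is left untouched.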