
I am writing a compiler using the LLVM C++ API, and I am trying to access an array parameter inside a function.

As far as I can tell from clang's IR generation, the following two functions produce different LLVM IR:

void foo(void) {
  int a[2];
  a[0] = 1;
  // %1 = getelementptr inbounds [2 x i32], [2 x i32]* %0, i32 0, i32 0
  // store i32 1, i32* %1
}

void bar(int a[]) {
  a[0] = 1;
  // store i32* %0, i32** %3, align 8
  // %4 = load i32*, i32** %3, align 8
  // %5 = getelementptr inbounds i32, i32* %4, i64 0
  // store i32 1, i32* %5, align 4
}

When an array is passed as a parameter, I need to store the argument with builder.CreateStore(), load it back, and then use llvm::GetElementPtrInst::CreateInBounds() to get a pointer to the indexed element.

However, my compiler uses the visitor pattern and treats code like a[0] = 1 as an assignment expression. When visiting the assign_expression tree node, I need to determine whether the load is needed.

Is there a way to determine whether the array is a locally defined variable or a parameter?

Update 1: for example, in C, if a function is defined like this:

void test(int a[]) {
  a[0] = 1;
}

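the corresponding LLVM C++ code for a[0] = 1 looks like this:

// Declared outside the loop so it is still in scope below.
llvm::Value *param = nullptr;
for (auto arg = theFunction->arg_begin(); arg != theFunction->arg_end(); ++arg) {
  // Spill the incoming i32* argument into a stack slot.
  param = builder.CreateAlloca(llvm::Type::getInt32Ty(context)->getPointerTo());
  builder.CreateStore(arg, param);
}

// a[0] = 1
auto loaded_tmp = builder.CreateLoad(param);
// The loaded value is an i32*, so a single index selects the element.
auto value = llvm::GetElementPtrInst::CreateInBounds(loaded_tmp, {Const(0)}, "", the_basic_block);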

However, when the array is defined locally, the line auto loaded_tmp = builder.CreateLoad(param); is not needed. So my question is: how do I know whether I need the CreateLoad?

Update 2: The LLVM IR generated by clang for the following C code:

int h(int a[]) {
    a[0] = 1;
    a[1] = 2;
}

is

define dso_local i32 @h(i32*) #0 {
  %2 = alloca i32, align 4
  %3 = alloca i32*, align 8
  store i32* %0, i32** %3, align 8
  %4 = load i32*, i32** %3, align 8
  %5 = getelementptr inbounds i32, i32* %4, i64 0
  store i32 1, i32* %5, align 4
  %6 = load i32*, i32** %3, align 8
  %7 = getelementptr inbounds i32, i32* %6, i64 1
  store i32 2, i32* %7, align 4
  %8 = load i32, i32* %2, align 4
  ret i32 %8
}

which has a load instruction before each store.

  • You can test a value using e.g. if(isa<Argument>(value)){… but I have a feeling that you're doing something bad. Perhaps my intuition is you shouldn't need to load sometimes, it should be always or neither. Anyway, isa is what you want now. Have fun. Commented Feb 14, 2020 at 9:02
  • @arnt Hi, as you can see in my post, in a function where an array is passed as a parameter, the IR clang generates always uses a store before the getelementptr. However, in a function where the array is locally defined, there is no store. So I think I need to check whether the array in the expression a[0] = 1 is locally defined or passed as a parameter. Thanks for your reply, I will try isa to see if it works. Commented Feb 14, 2020 at 10:42
  • Oh, wait, I understand now. Not everything clang does is required by LLVM. Some things are required by LLVM, some other things are required by clang's own architecture, yet other things are required by the C/C++ ABI in use, and finally clang does many things that could also be done in other ways. Commented Feb 14, 2020 at 10:46
  • @arnt Hi, I think my problem is LLVM related. I have updated my post, can you please check it out? Commented Feb 14, 2020 at 11:07
  • I don't work in a C/C++ compiler. My compiler never would generate such a store. Just guessing here: Clang could create debug info that says the variable is accessible to debuggers at such-and-such location, and is emitting extra instructions to store the variable where debuggers will look for it, just in case debug info will eventually be created and a debugger used. That's guesswork, I repeat, guesswork. Commented Feb 14, 2020 at 11:21

1 Answer

Is there a way to determine whether the array is a locally defined variable or a parameter?

Method 1:

Use isa<Argument>, like this:

Analysis/BasicAliasAnalysis.cpp

  if (isa<Argument>(V))
    return true;

Method 2:

Use a combination of LLVMGetFirstParam, LLVMGetLastParam, and LLVMGetNextParam to check whether a value is a parameter.

See: https://github.com/llvm/llvm-project/blob/master/llvm/lib/IR/Core.cpp#L2456
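
Another option is to avoid inspecting the IR at all and let the front end record, in its own symbol table, how each variable is stored; the visitor can then decide from that record whether a load is needed. The following is only a minimal sketch under assumptions: it does not use the LLVM API and simply builds textual IR in the typed-pointer syntax shown in the question. Storage, Symbol, and emitIndexedStore are hypothetical names invented for illustration.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical front-end bookkeeping (not part of LLVM):
// how the stack slot of a named variable is laid out.
enum class Storage {
    LocalArray,  // slot is `alloca [N x i32]`: the slot itself is the array
    PointerSlot  // slot is `alloca i32*`: it holds a pointer (array parameter)
};

struct Symbol {
    Storage storage;
    std::string slot;  // IR name of the alloca, e.g. "%a" or "%a.addr"
    unsigned length;   // element count, only meaningful for LocalArray
};

// Build textual IR for `name[index] = value`.
// The temporaries %base and %elt are fixed names to keep the sketch short;
// a real emitter would generate unique names.
std::string emitIndexedStore(const std::map<std::string, Symbol>& symbols,
                             const std::string& name, unsigned index, int value) {
    auto it = symbols.find(name);
    assert(it != symbols.end() && "variable must be declared before use");
    const Symbol& sym = it->second;

    std::ostringstream ir;
    if (sym.storage == Storage::LocalArray) {
        // No load: index directly into the array that lives in the slot.
        const std::string arrayTy = "[" + std::to_string(sym.length) + " x i32]";
        ir << "%elt = getelementptr inbounds " << arrayTy << ", " << arrayTy
           << "* " << sym.slot << ", i32 0, i32 " << index << "\n";
    } else {
        // Load the pointer out of the slot first, then index from it.
        ir << "%base = load i32*, i32** " << sym.slot << "\n";
        ir << "%elt = getelementptr inbounds i32, i32* %base, i64 " << index << "\n";
    }
    ir << "store i32 " << value << ", i32* %elt\n";
    return ir.str();
}
```

With this bookkeeping, the assign_expression visitor never asks whether a value is a parameter. It only asks what kind of slot the name refers to, which also covers a local int* variable that happens to point at an array.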

2 Comments

Thanks for your reply. However, there is a store instruction before the load, as in the following LLVM IR: store i32* %0, i32** %3; %4 = load i32*, i32** %3. So the value being loaded is not the function argument itself...
It seems that running mem2reg before your analysis should get rid of the redundant store+load pairs. If that is not possible, then you'd need to follow the def-use chain back to its origin and use isa<Argument> there.
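
To illustrate what following the chain back means, here is a toy model, not LLVM code. Kind, Value, traceToSource, and originatesFromArgument are hypothetical names, and the model assumes each stack slot is written by exactly one store, which holds for the clang output in the question but not for IR in general.

```cpp
#include <cassert>

// Toy stand-in for IR values (an assumption for illustration, not llvm::Value).
enum class Kind { Argument, Alloca, Load };

struct Value {
    Kind kind;
    const Value* operand;  // Load: the slot being loaded from; otherwise nullptr
    const Value* stored;   // Alloca: the single value stored into it; otherwise nullptr
};

// Walk backwards: load -> its slot -> the value stored into that slot -> ...
// Stops at the first value that cannot be traced any further.
const Value* traceToSource(const Value* v) {
    while (v != nullptr) {
        if (v->kind == Kind::Load && v->operand != nullptr) {
            v = v->operand;
        } else if (v->kind == Kind::Alloca && v->stored != nullptr) {
            v = v->stored;
        } else {
            break;
        }
    }
    return v;
}

// True when the traced origin is a function argument.
bool originatesFromArgument(const Value* v) {
    const Value* source = traceToSource(v);
    return source != nullptr && source->kind == Kind::Argument;
}
```

In real LLVM code the same walk would go through the operands of the load and the users of the alloca, and slots written by several stores would need extra handling, so running mem2reg first is likely the simpler route.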
