I'm trying to pass an array as a parameter and use it inside another function, in a toy language similar to C.

The following program compiles and runs correctly with my compiler:

int at_index(int a[], int index) {
  return 0;
}

int main() {
  int a[10];
  int tmp;
  tmp = at_index(a, 0);
  print(tmp);
  return 0;
}

For this program, my compiler generates the following IR:

; ModuleID = 'MicrocC-module'
source_filename = "MicrocC-module"

declare void @print(i32)

declare i32 @getint()

define i32 @at_index(i32* %0, i32 %1) {
entry:
  %a = alloca i32*
  store i32* %0, i32** %a
  %index = alloca i32
  store i32 %1, i32* %index
  ret i32 0
}

define i32 @main() {
entry:
  %a = alloca [10 x i32]
  %tmp = alloca i32
  %a1 = getelementptr inbounds [10 x i32], [10 x i32]* %a, i32 0, i32 0
  %0 = call i32 @at_index(i32* %a1, i32 0)
  store i32 %0, i32* %tmp
  %tmp2 = load i32, i32* %tmp
  call void @print(i32 %tmp2)
  ret i32 0
}

But if I try to use the array inside the function, the compiler crashes with a core dump. A small program that triggers the crash is:

int at_index(int a[], int index) {
  a[0] = 0;
  return 0;
}

int main() {
  int a[10];
  int tmp;
  tmp = at_index(a, 0);
  print(tmp);
  return 0;
}

I started debugging the code but I can't find the error; maybe the way I access the array in the code generator is wrong.

To access a position in an array, I take the access node from the AST, which carries the variable and the index, and call the following LLVM API from OCaml:

Llvm.build_in_bounds_gep variable [0, index] "access" llvm_builder

I then perform every operation on the variable through the result of this call. My suspicion is that inside a function body, when the array is a parameter, my variable is a pointer, and that is why I get the error. That is only a guess, though, and it is where I'm stuck. Is there some additional operation needed to access an array when it is a parameter?

Update

The portions of code involved in generating the function call are below.

The rules that generate the alloca and store inside the function declaration:

let rec translate_stm_to_llvm_ir llvm_builder stm_def =
  match stm_def.node with
  | Ast.Dec(tipe, id) ->
    begin
      logger#debug "Declaration stm processing for type %s" (Ast.show_typ tipe);
      match tipe with
      | Ast.TypArray(arr_tipe, size) ->
        begin
          let llvm_arr = gen_array_type (to_llvm_type arr_tipe) size in
          let llvm_val = Llvm.build_alloca llvm_arr id llvm_builder in
          (>->) id llvm_val false
        end
      | _ ->
        begin
          logger#trace "Literal variable build with LLVM";
          let all_llvm = Llvm.build_alloca (to_llvm_type tipe) id llvm_builder in
          (>->) id all_llvm false
        end
    end

How I translate the parameters during a function call, e.g. creating a pointer for int array[] and a value for a[i]:

and translate_fun_exp_to_llvm exp llvm_builder =
    match exp.node with
    | Assign(variable, exp) ->
      begin
        logger#trace "*Assign stm* translating ...";
        let llvm_var = translate_acc_var variable llvm_builder in
        let exp_to_assign = translate_exp_to_llvm exp llvm_builder in
        let _ = Llvm.build_store exp_to_assign llvm_var llvm_builder in
        Llvm.build_load llvm_var "" llvm_builder
      end
    | Access(access) ->
      begin
        match access.node with
        | Ast.AccVar(id) ->
          begin
            logger#error "Access on var inside a function call";
            try
              let acc_var = (<-<) id in
              let type_var = Llvm.type_of acc_var in
              let name_var = Llvm.value_name acc_var in
              match (Llvm.string_of_lltype type_var) with
              | "i32*" | "i8*" | "i1*" ->
                logger#error "lvalue in function call";
                Llvm.build_load acc_var name_var llvm_builder
              | _ ->
                begin
                  logger#trace "Access to first element of the array";
                  let first_pos = Llvm.const_int gen_int_type 0 in
                  Llvm.build_in_bounds_gep acc_var (Array.of_list [first_pos; first_pos]) name_var llvm_builder
                end
            with Not_found -> failwith "Variable not found"
          end
        | _ ->
          begin
            let acc_var = translate_acc_var access llvm_builder in
            Llvm.build_load acc_var "" llvm_builder

How I translate a variable access inside the function:

and translate_acc_var acc_def llvm_builder =
  match acc_def.node with
  | Ast.AccVar(id) ->
    begin
      logger#trace "Access variable with id %s ...." id;
      try (<-<) id
      with Not_found -> failwith "Variable not found"
    end
  | AccIndex(acc, index) ->
    begin
      let variable = translate_acc_var acc llvm_builder in
      let index_exp = translate_exp_to_llvm index llvm_builder in
      let zeros_pos = Llvm.const_int gen_int_type 0 in
      Llvm.build_in_bounds_gep variable (Array.of_list([zeros_pos; index_exp])) "" llvm_builder
    end
  | AccDeref (expr) as node->
    begin
      logger#debug "* *%s * Translating ..." (show_access_node node);
      let llval = translate_exp_to_llvm expr llvm_builder in
      let val_name = Llvm.value_name llval in
      let type_val = Llvm.type_of llval in
      Llvm.build_ptrtoint llval type_val val_name llvm_builder
    end
  | _ -> failwith "Access var not implemented"

The AST is defined by the following types:

type typ =
  | TypInt                             (* Type int                    *)
  | TypBool                            (* Type bool                   *)
  | TypChar                            (* Type char                   *)
  | TypArray of typ * int option       (* Array type                  *)
  | TypPoint of typ                    (* Pointer type                *)
  | TypVoid                            (* Type void                   *)
  [@@deriving show]

and expr = expr_node annotated_node
and expr_node =
  | Access of access                 (* x    or  *p    or  a[e]     *)
  | Assign of access * expr          (* x=e  or  *p=e  or  a[e]=e   *)
  | Addr of access                   (* &x   or  &*p   or  &a[e]    *)
  | ILiteral of int                  (* Integer literal             *)
  | CLiteral of char                 (* Char literal                *)
  | BLiteral of bool                 (* Bool literal                *)
  | UnaryOp of uop * expr            (* Unary primitive operator    *)
  | BinaryOp of binop * expr * expr  (* Binary primitive operator   *)
  | Call of identifier * expr list   (* Function call f(...)        *)
  [@@deriving show]

and access = access_node annotated_node
and access_node =
  | AccVar of identifier             (* Variable access        x    *)
  | AccDeref of expr                 (* Pointer dereferencing  *p   *)
  | AccIndex of access * expr        (* Array indexing         a[e] *)
  [@@deriving show]

After adding a logger, I captured the last values seen before the call to getelementptr inbounds.

The values at that point are:

[0.042  Trace      CodeGen              ] Variable ->   %2 = load i32*, i32** %a
[0.042  Trace      CodeGen              ] index ->   %3 = load i32, i32* %index
  • Please post a minimal reproducible example. Right now there are several things unclear to me about the one line of code you posted: Does variable refer to the alloca, the result of loading the alloca or something else (such as %0 for example)? Same question for index. And then I'm assuming you're using the GEP as an operand to store? Commented Jan 19, 2021 at 11:20
  • On another note, if your compiler crashes with a segmentation fault, that might mean that you're running an LLVM build with assertions disabled. In that case you should rebuild LLVM with assertions enabled, so you'll get assertion errors instead of segmentation faults, which will be more helpful in figuring out what's wrong. Commented Jan 19, 2021 at 11:29
  • Regarding the first comment: I updated the question with a portion of the real code to show how the calls are made. Sorry that it isn't a runnable example; if it isn't enough, I will work on a runnable example that makes sense and isn't just a generic one. Commented Jan 19, 2021 at 11:32
  • Regarding the second comment: can you link me a reference on how to do this with OCaml? Or do I need to build LLVM from source? Commented Jan 19, 2021 at 11:33
  • You'll need to build LLVM from source, yes (unless there's already a binary distribution for your platform with assertions enabled, but I'm not aware of any). llvm.org/docs/CMake.html - you'll want to either set the build type to debug (which will also give you debug symbols, but require a lot of RAM to link and give you huge binaries) or set LLVM_ENABLE_ASSERTIONS to true. Commented Jan 19, 2021 at 11:45

1 Answer

Looking at your code, it looks like variable refers to %a, an alloca of type i32**, and index refers to %index, an alloca of type i32*. This introduces two problems:

  1. A GEP is just an address calculation - it doesn't dereference anything. Therefore you can't use it to go through a double pointer like this. You'll need to dereference the alloca first using a load and then use the result of the load in the GEP. You should then also drop the 0, since you'll only have a single pointer after the load, requiring only one index.
  2. The indices in a GEP must be integers, not pointers to integers. So again, you'll want to load the index and use the result of the load in the GEP.
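
To illustrate the two points above, here is a minimal sketch of the GEP typing rule in plain OCaml. It does not use the real Llvm bindings: the `ty` type and the `gep_type` function are hypothetical stand-ins, limited to i32, pointers and arrays, and they assume typed pointers as in the IR shown in the question.

```ocaml
(* Hypothetical stand-in for the LLVM types in the question; not the Llvm API. *)
type ty =
  | I32
  | Ptr of ty
  | Arr of int * ty

(* Type of the address computed by a GEP on a value of type [base] with
   [n_indices] integer indices, or None when the GEP is ill-typed.
   The first index steps over the base pointer itself; every later index
   must step into an aggregate. No memory is read at any point. *)
let gep_type (base : ty) (n_indices : int) : ty option =
  let rec step t remaining =
    if remaining = 0 then Some (Ptr t)
    else
      match t with
      | Arr (_, elt) -> step elt (remaining - 1)
      | I32 | Ptr _ -> None (* cannot index into a scalar or through a pointer *)
  in
  match base with
  | Ptr pointee when n_indices >= 1 -> step pointee (n_indices - 1)
  | _ -> None

(* main's local array: [10 x i32]* with indices [0; i] is accepted. *)
let local_array = gep_type (Ptr (Arr (10, I32))) 2

(* The parameter's alloca: i32** with indices [0; i] is rejected. *)
let param_slot = gep_type (Ptr (Ptr I32)) 2

(* After loading the alloca: i32* with the single index [i] is accepted. *)
let loaded_param = gep_type (Ptr I32) 1
```

In this model `local_array` and `loaded_param` both evaluate to `Some (Ptr I32)`, while `param_slot` is `None`. The second index would have to pass through a pointer, which a pure address calculation cannot do.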

9 Comments

So, just to understand while my LLVM build with assertions is running: do I also need to create an alloca and a load for the Llvm.const_int when I access a pointer with a GEP operation?
Regarding point 1, do you mean that I need an extra alloca and store for each indexed access? In other words, I must not return the llvalue from build_in_bounds_gep directly, but need an additional store and load, right?
@vincenzopalazzo No, I'm saying that you'll need to create a load that dereferences variable and then use that load as an argument to build_in_bounds_gep. Same for index. The const_int is fine. That's the one operand of the GEP that has the correct type already.
Funny :-D I have only one correct type. More generally, when I pass the variable to a function, I need to do the alloca, store and load operations. I will try it and come back in a moment.
@vincenzopalazzo Oh, yes, you're right. Since a evaluates to an i32*, not an [n x i32]*, there should only be one index (which should be index). I'll add this to my answer.
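
The outcome of this thread can be sketched as a small lowering function in plain OCaml. This is only an illustration under stated assumptions: the `storage` type and `lower_index` are hypothetical names that are not part of the question's compiler, the function emits IR as text instead of calling the Llvm bindings, and the index is assumed to be an already-loaded i32 operand.

```ocaml
(* Hypothetical description of how a variable was introduced; not part of
   the question's compiler. *)
type storage =
  | Local_array of int (* int a[n]; the slot has type [n x i32]* *)
  | Array_param        (* int a[];  the slot has type i32**      *)

(* Textual IR that computes the address of name[index] into %addr.
   [index] is an already-loaded i32 operand such as "0" or "%3". *)
let lower_index ~(name : string) ~(storage : storage) ~(index : string) :
    string list =
  match storage with
  | Local_array n ->
    (* The alloca is the array itself: step over the pointer, then index. *)
    [ Printf.sprintf
        "%%addr = getelementptr inbounds [%d x i32], [%d x i32]* %%%s, i32 0, i32 %s"
        n n name index ]
  | Array_param ->
    (* The alloca only holds the pointer: load it, then use a single index. *)
    [ Printf.sprintf "%%base = load i32*, i32** %%%s" name;
      Printf.sprintf "%%addr = getelementptr inbounds i32, i32* %%base, i32 %s"
        index ]
```

For the parameter case, `lower_index ~name:"a" ~storage:Array_param ~index:"0"` yields a load followed by a one-index GEP, whereas the local array case keeps the two-index form that already works in main.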