4

I am running an atmospheric model, and need to compile an executable to convert some files. If I compile the code as supplied, it runs but it gets stuck and doesn't ever complete. It doesn't give an error or anything like that.

After doing some testing by adding print statements to see where it was getting stuck, I've found that the executable only runs if I compile the code with a print statement in one of the subroutines.

The piece of code in question is the one here. Specifically, the code fails to run unless I put a print statement somewhere in the get_bottom_top_dim subroutine.

Does anyone know why this might be? It doesn't matter what the print statement is (currently I'm using print*, '!'), but as soon as I remove it or comment it out, the code no longer works.

I'm assuming it must have something to do with my machine or compiler (ifort 12.1.0), but I'm stumped as to what the problem is!

2
  • Please consider this as a comment. Have you tried using a Fortran debugger (on your machine) to identify where the program fails without the print statement, and why it continues executing when you add the print statement? Commented Jun 14, 2014 at 20:57
  • 1
    This is a very large code and the main program is missing. Compile it with -check -warn -g -traceback and try to run it again. Commented Jun 15, 2014 at 13:58

3 Answers

5

This is an extended comment rather than an answer:

The situation you describe, inserting a print statement which apparently fixes a program, often arises when the underlying problem is due to either

a) an attempt to access an element outside the declared bounds of an array; or

b) a mismatch between dummy and actual arguments to some procedure.

Recompile your program with the compiler options to check interfaces at compile-time and to check array bounds at run-time.
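To illustrate case (a), here is a minimal sketch of an out-of-bounds write. It is a hypothetical example, not taken from the code in the question: the program name, the array `a`, and the variable `guard` are all made up for illustration. The ifort options in the comments (`-check bounds`, `-warn interfaces`, `-g`, `-traceback`) are the ones I would try first; check your compiler's documentation for the exact spelling in your version.

```fortran
! Hypothetical example, not taken from the code in the question.
! With ifort, run-time bounds checking can be enabled with, for example:
!   ifort -check bounds -g -traceback oob_demo.f90
! For case (b), compile-time interface checking is requested with:
!   ifort -warn interfaces ...
program oob_demo
  implicit none
  integer :: a(4)
  integer :: guard
  integer :: i, n

  guard = 0
  n = 5                 ! deliberately one more than size(a)
  do i = 1, n
     a(i) = i           ! when i == 5 this writes past the end of a
  end do

  ! Without bounds checking the result is undefined: guard may or may not
  ! have been overwritten, depending on how the compiler lays out memory.
  print *, 'guard =', guard
end program oob_demo
```

Built without checks, a program like this may appear to run normally, hang, or crash, and adding or removing an unrelated statement can change which of those happens. Built with bounds checking, it should instead stop with a run-time error that identifies the offending array and index.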

2 Comments

Hi, can you elaborate on how adding a print statement can stop a run-time error such as accessing an array element outside its declared bounds? Is this specific to the Fortran compiler? Thanks.
Unchecked access to an array out of its bounds is not specified by the standard and therefore the program could do anything it wants. Physically, it can mean corrupting memory, which may alter the values of local variables or disrupt the call stack, but that's only the beginning. In particular, the compiler is not guaranteed to compile your program correctly in these cases.
0

Fortran has evolved a LOT since I last used it but here's how to go about solving your problem.

  • Think of some hypotheses that could explain the symptoms, e.g. the compiler is optimizing the subroutine down to a no-op when it has no print side effect. Or a compiler bug is translating this code into something empty, an infinite loop, or crashing code. (What exactly do you mean by "fails to run"?) Or the linker is failing to link in some needed code unless the subroutine explicitly calls print. Or there's a bug in this subroutine and the print statement alters its symptoms, e.g. by changing which data gets overwritten by an index-out-of-bounds bug.
  • Think of ways to test these hypotheses. You might already have observations adequate to rule out some of them. You could disassemble the object code to see if this subroutine is empty. Or step through it in a debugger. Or replace the print statement with a different side effect, like logging to a file or to an in-memory text buffer. Or turn on all optional run-time memory checks and compile-time warnings. Or simplify the code until the problem goes away, then binary search on bringing back code until the problem recurs.
  • Do the most likely or easiest tests first. Rule out some hypotheses, and iterate.
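As one concrete way to "replace the print statement with a different side effect", here is a rough sketch that writes into an in-memory character buffer using an internal write. Everything in it is hypothetical: `suspect_routine` merely stands in for the subroutine that misbehaves, and `trace_buffer` and `ncalls` are names invented for the example.

```fortran
program trace_demo
  implicit none
  character(len=64) :: trace_buffer   ! hypothetical in-memory log
  integer :: ncalls

  ncalls = 0
  trace_buffer = ''
  call suspect_routine(trace_buffer, ncalls)
  print *, trim(trace_buffer)

contains

  ! Stand-in for the routine that only "works" with a print statement.
  subroutine suspect_routine(buffer, counter)
    character(len=*), intent(inout) :: buffer
    integer, intent(inout) :: counter

    counter = counter + 1
    ! Internal write: formats into a character variable instead of stdout.
    write(buffer, '(a,i0)') 'suspect_routine call ', counter
  end subroutine suspect_routine

end program trace_demo
```

If the program still completes with the internal write in place of the print, that suggests the terminal output itself is not what matters, which would point toward memory layout or optimization effects. If the hang comes back, the I/O done by print may be part of the story. Either result helps narrow the list of hypotheses.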

0

I had a similar bug and found that the problem was in the dependencies in the makefile. This is what I saw:

  1. I set a variable to a value and the program stops.
  2. I add a print statement and it works.
  3. I delete the print statement and it continues to work.
  4. I alter the variable's value and it stops again.

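The thing is, the variable's value is set in parameters.f90. The print statement is in a file, h3.f90, that depends on parameters.f90, but that dependency was not declared in the makefile. So h3.o was presumably not being rebuilt when parameters.f90 changed; editing h3.f90 to add or remove the print statement is what forced a recompile.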

After correcting:

h3.o: h3.f90 variables.f90 parameters.f90
        $(FC) -c h3.f90 

It all worked properly.
