0

I have some code that takes care of command acknowledgment from remote computers. Occasionally (roughly once every 14 days), the following line throws a null reference exception:

computer.ProcessCommandAcknowledgment( commandType );

What really bugs me is that I check for a null reference before it, so I have no idea what's going on. Here's the full method, for what it's worth:

    public static void __CommandAck( PacketReader reader, SocketContext context )
    {
        string commandAck = reader.ReadString();

        Type commandType = Type.GetType( commandAck );

        Computer computer = context.Client as Computer;

        if (computer == null)
        {
            Console.WriteLine("Client already disposed. Couldn't complete operation");
        }
        else
        {
            computer.ProcessCommandAcknowledgment( commandType );
        }
    }

Any clues?

Edit: ProcessCommandAcknowledgment:

    public void ProcessCommandAcknowledgment( Type ackType )
    {
        if( m_CurrentCommand.GetType() == ackType )
        {
            m_CurrentCommand.Finish();
        }
    }
4
  • Can you specify 1) multiple threads (yes/no), 2) CLR version and 3) processor architecture Commented Dec 27, 2008 at 1:14
  • Single threaded / 2.0.50727.1433 / x86 Commented Dec 27, 2008 at 1:18
  • Can you post the signature(s) for ProcessCommandAcknowledgment? Commented Dec 27, 2008 at 1:27
  • Can you post a stack trace of the exception? Commented Dec 27, 2008 at 3:17

8 Answers

4

Based on the information you gave, it certainly appears impossible for a null ref to occur at that location. So the next question is "How do you know that the particular line is creating the NullReferenceException?" Are you using the debugger or stack trace information? Are you checking a retail or debug version of the code?

If it's the debugger, various combinations of settings can cause the debugger to appear to report the NullRef in a different place. The main one that would do that is the Just My Code (JMC) setting.

In my experience, I've found the most reliable way to determine the line an exception actually occurs on is to ...

  1. Turn off JMC
  2. Compile with Debug
  3. Debugger -> Settings -> Break on Throw CLR exceptions.
  4. Check the StackTrace property in the debugger window

1 Comment

I have a PDB detailed stack trace. The information is 100% correct.
2

I would bet money that there's a problem with your TCP framing code (if you have any!)

"PacketReader" perhaps suggests that you don't. Because, technically, it would be called "FrameReader" or something similar if you did.

If the two PCs involved are on a local LAN or something similar, that would probably explain the 14-day interval. If you tried this over the Internet, I bet the error would occur much more often, especially if the WAN bandwidth was contended.
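For readers unfamiliar with the term, here is a minimal sketch of what framing on top of TCP can look like. It assumes a hypothetical wire format of a 4-byte little-endian length prefix followed by the payload; `FrameReader`, `ReadExactly` and `ReadFrame` are illustrative names, not part of the asker's code. The point is that a single `Stream.Read` call may return fewer bytes than requested, so the reader has to loop until the whole frame has arrived.

```csharp
using System;
using System.IO;

public static class FrameReader
{
    // Reads exactly 'count' bytes or throws.
    // Stream.Read may return fewer bytes than requested, so loop until done.
    public static byte[] ReadExactly(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
            {
                throw new EndOfStreamException("Connection closed mid-frame.");
            }
            offset += read;
        }
        return buffer;
    }

    // Assumed (hypothetical) wire format: 4-byte little-endian length, then payload.
    public static byte[] ReadFrame(Stream stream)
    {
        byte[] header = ReadExactly(stream, 4);
        int length = header[0] | (header[1] << 8) | (header[2] << 16) | (header[3] << 24);
        if (length < 0)
        {
            throw new InvalidDataException("Negative frame length.");
        }
        return ReadExactly(stream, length);
    }
}
```

A real protocol would probably also cap the frame length before allocating the buffer.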

1 Comment

Invalid packets are discarded and clients who sent them disconnected. First the crc of the payload is checked, then the packets are decompressed and finally decrypted. If any of these steps fail the client is disconnected.
2

Is it possible that ReadString() is returning null? This would cause GetType to fail. Perhaps you've received an empty packet? Alternatively, the string may not match a type and thus commandType would be null when used later.

EDIT: Have you checked that m_CurrentCommand is not null when you invoke ProcessCommandAcknowledgment?
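A defensive version of the check might look like the sketch below. The `Command`, `PingCommand` and `Computer` types here are simplified stand-ins rather than the asker's real classes, and the `bool` return value is an addition for illustration. It relies on two things: `Type.GetType(string)` returns null (rather than throwing) when the name does not resolve to a type, and reading the field once into a local means the null check and the later calls see the same reference.

```csharp
using System;

public abstract class Command
{
    public bool Finished { get; private set; }
    public void Finish() { Finished = true; }
}

public sealed class PingCommand : Command { }

public class Computer
{
    // Hypothetical stand-in for the asker's field; null until assigned.
    private Command m_CurrentCommand;

    public Command CurrentCommand
    {
        get { return m_CurrentCommand; }
        set { m_CurrentCommand = value; }
    }

    // Returns true only when the ack matched and the command was finished.
    public bool ProcessCommandAcknowledgment(Type ackType)
    {
        // Read the field once so the check and the calls use the same reference.
        Command current = m_CurrentCommand;
        if (current == null || ackType == null)
        {
            return false;
        }
        if (current.GetType() == ackType)
        {
            current.Finish();
            return true;
        }
        return false;
    }
}
```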

3 Comments

Nah, the nullref is thrown at the line I highlighted. The packets are 100% sure to be valid and complete (hash control, plus they're encrypted).
Besides that, the command acknowledge method can never throw a nullref, as it's comparing the given type with the type bound to the computer, so a comparison to null is actually valid.
No part of code ever sets it to null. In the ctor it defaults to Command.Invalid.
1

What are the other thread(s) doing?

Edit: You mention that the server is single threaded, but another comment suggests that only this portion is single threaded. If that's the case, you could still have concurrency issues.

Bottom line here, I think, is that you either have a multi-thread issue or a CLR bug. You can guess which I think is more likely.

5 Comments

Sure, there are dozens of other threads doing something useful. The netcode, on the other hand, is single-threaded. Besides, I'm working with a local variable that can never be null at that point, unless GC'ed or whatever.
@arul - Is your backing field declared volatile? Are you taking locks when you set that field? There can be slightly less obvious concurrency issues.
It's a local variable, which exists on stack and can't be declared as volatile.
The backing field for Context.Client may need to be volatile.
Declaring it as volatile probably solved the problem; I mean, it hasn't happened since then.
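As a rough sketch of what the comments above describe: the backing field behind `Client` is marked `volatile`, and the handler reads the property exactly once into a local before checking it. `SocketContext` and `Computer` here are simplified stand-ins for the asker's types, and `AckHandler.TryGetComputer` is a hypothetical helper. Note that `volatile` only affects how the field itself is read and written; it does not make a longer check-then-use sequence atomic, so a lock may still be needed elsewhere.

```csharp
public class Computer { }

public class SocketContext
{
    // volatile: asks the compiler and JIT not to cache this field in a
    // register or reorder accesses to it, so other threads' writes are seen.
    private volatile object m_Client;

    public object Client
    {
        get { return m_Client; }
        set { m_Client = value; }
    }
}

public static class AckHandler
{
    // Reads context.Client exactly once; later code should use 'computer',
    // never context.Client again, so a concurrent reset cannot affect it.
    public static bool TryGetComputer(SocketContext context, out Computer computer)
    {
        computer = context.Client as Computer;
        return computer != null;
    }
}
```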
1

If you have optimizations turned on, the reported line is likely pointing you to a very different place from where the exception actually happens.

Something similar happened to me a few years back.

1 Comment

Well, it's been working reliably with the very same compiler settings. It has always pointed to the location where the error was. Only in this particular case does there seem to be a problem.
1

Or else there's a thread race somewhere, where context gets set to null by another thread. That would also explain how rare the error is.

Comments

1

Okay, there are really only a few possibilities.

  1. Somehow your computer reference is being tromped by the time you call that routine.

  2. Something under the call is throwing the null pointer dereference error but it's being detected at that line.

Looking at it, I strongly suspect the stack is getting corrupted, causing your computer automatic (local) variable to get mangled. Check the subroutine/method/function calls around the one you have trouble with; in particular, check that what you're making into a "Computer" item really is the type you expect.

3 Comments

Thank you, any clues how to debug it?
These can be hard to debug. I'd try two things: (1) catch the null reference exception at that location, and get as much information as you can when it happens; (2) grep the code for calls to that routine, and examine the code at each one.
Oh, in that exception handler, have a look at what context.Client is really giving you.
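Suggestion (1) could be sketched roughly as follows. `AckDiagnostics`, `Describe` and `TryRun` are made-up helper names, and the `log` delegate stands in for whatever logging the application already has. The idea is just to wrap the suspect call, catch the `NullReferenceException` there, and record what the client object and the command type actually were at that moment, along with the stack trace.

```csharp
using System;
using System.Text;

public static class AckDiagnostics
{
    // Builds a one-line summary of the state at the time of the failure.
    public static string Describe(object client, Type commandType, Exception ex)
    {
        var sb = new StringBuilder();
        sb.Append("client=").Append(client == null ? "<null>" : client.GetType().FullName);
        sb.Append("; commandType=").Append(commandType == null ? "<null>" : commandType.FullName);
        sb.Append("; exception=").Append(ex.GetType().Name);
        return sb.ToString();
    }

    // Runs 'action'; on a NullReferenceException, logs the summary plus the
    // stack trace and returns false instead of letting the exception escape.
    public static bool TryRun(Action action, object client, Type commandType, Action<string> log)
    {
        try
        {
            action();
            return true;
        }
        catch (NullReferenceException ex)
        {
            log(Describe(client, commandType, ex) + Environment.NewLine + ex.StackTrace);
            return false;
        }
    }
}
```

In the handler from the question, the call would look something like `AckDiagnostics.TryRun(() => computer.ProcessCommandAcknowledgment(commandType), context.Client, commandType, Console.WriteLine);`.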
1

computer.ProcessCommandAcknowledgment( commandType );

Do you have debugging symbols to be able to step into this?

The null ref exception could be thrown by ProcessCommandAcknowledgment, and bubble up.

1 Comment

Yep. Besides that, ProcessCommandAcknowledgment accepts a null command type.
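One way to test this theory, sketched under the assumption that the build is optimized: the JIT may inline a small method into its caller, in which case the method has no frame of its own in the stack trace and an exception raised inside it is attributed to the calling line. Marking the method with `MethodImplOptions.NoInlining` keeps it as a separate frame. The `Command` and `Computer` classes below are simplified stand-ins, and the public field is only there to keep the example short.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Command
{
    public void Finish() { }
}

public class Computer
{
    // Left null on purpose to provoke the exception in this sketch.
    public Command m_CurrentCommand;

    // NoInlining keeps this method as its own stack frame, so an exception
    // thrown here is reported here rather than at the call site.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public void ProcessCommandAcknowledgment(Type ackType)
    {
        if (m_CurrentCommand.GetType() == ackType)
        {
            m_CurrentCommand.Finish();
        }
    }
}
```

If the trace starts naming ProcessCommandAcknowledgment after this change, the null reference is m_CurrentCommand rather than computer.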
