Friday, November 24, 2023

Will sending `kill -11` to a Java process raise a NullPointerException?

Tags: hotspot, java, jvm, linux

Issue

The HotSpot JVM implements null-pointer detection by catching the SIGSEGV signal. So if we manually send a SIGSEGV from outside the process, can it also be recognized as a NullPointerException in some circumstances?

Solution

Will sending kill -11 to a Java process raise a NullPointerException?

It should not: a NullPointerException is a specific exception that occurs when an application tries to use an object reference that has the null value.

Yet the Java SE 17 Troubleshooting Guide, under "Handle Signals and Exceptions", says:

The Java HotSpot VM installs signal handlers to implement various features and to handle fatal error conditions.

For example, in an optimization to avoid explicit null checks in cases where java.lang.NullPointerException will be thrown rarely, the SIGSEGV signal is caught and handled, and the NullPointerException is thrown.

In general, there are two categories where signal/traps happen:

When signals are expected and handled, like implicit null-handling. Another example is the safepoint polling mechanism, which protects a page in memory when a safepoint is required. Any thread that accesses that page causes a SIGSEGV, which results in the execution of a stub that brings the thread to a safepoint.

Unexpected signals. That includes a SIGSEGV when executing in VM code, Java Native Interface (JNI) code, or native code. In these cases, the signal is unexpected, so fatal error handling is invoked to create the error log and terminate the process.

That approach allows the JVM to optimize performance by reducing the overhead of explicit null checks in the code, relying instead on the operating system's memory protection mechanisms to detect access to null references. When such access occurs, the operating system generates a SIGSEGV signal, which the JVM then interprets as an attempt to dereference a null pointer, leading to the throwing of a NullPointerException.

However, it is important to note that this is an internal mechanism of the JVM and is distinct from externally generated SIGSEGV signals, such as those sent using the kill command. External SIGSEGV signals are generally used to indicate serious errors, including invalid memory access, and are more likely to result in a JVM crash or core dump rather than a NullPointerException.

+---------------------+         +-----------------------------------+
| External Process    |         | Java Process running on HotSpot   |
| sending SIGSEGV     | ------> | JVM                               |
| (kill -11)          |         | Likely JVM Crash or Core Dump     |
+---------------------+         +-----------------------------------+
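
One detail that helps when reasoning about externally sent signals: on Linux the kernel records a signal's origin in `siginfo_t`. A signal sent with `kill` arrives with `si_code == SI_USER` and the sender's pid in `si_pid`, whereas a genuine memory fault carries a `SEGV_*` code. The following is a minimal sketch of that, assuming a single-threaded Linux process. The names `record_origin` and `segv_from_kill_is_flagged_user_sent` are hypothetical, and nothing here is taken from HotSpot.

```cpp
#include <cassert>
#include <signal.h>
#include <unistd.h>

// Written by the handler; sig_atomic_t is safe to store from a signal handler.
static volatile sig_atomic_t g_seen_code = 0;
static volatile sig_atomic_t g_seen_pid = 0;
static volatile sig_atomic_t g_handled = 0;

// Hypothetical handler: only records where the signal came from.
static void record_origin(int, siginfo_t* info, void*) {
    g_seen_code = info->si_code;
    g_seen_pid = info->si_pid;  // only meaningful when si_code == SI_USER
    g_handled = 1;
}

// Sends SIGSEGV to the current process, as `kill -11 <pid>` would do from
// outside, and reports whether the handler saw it flagged as user-sent.
bool segv_from_kill_is_flagged_user_sent() {
    struct sigaction sa {};
    struct sigaction old {};
    sa.sa_sigaction = record_origin;
    sa.sa_flags = SA_SIGINFO;  // request the three-argument handler form
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGSEGV, &sa, &old);
    assert(rc == 0);
    (void)rc;

    // No instruction faulted, so returning from the handler simply resumes here.
    kill(getpid(), SIGSEGV);

    sigaction(SIGSEGV, &old, nullptr);  // restore the previous disposition
    return g_handled == 1 && g_seen_code == SI_USER && g_seen_pid == getpid();
}
```

Whether a particular JVM build consults `si_code` in this way is a separate question; the sketch only shows what information the operating system makes available to a handler.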

Can the JVM always tell that an external SIGSEGV is external, or could it mistake one for a null access if it arrives at a specific moment, i.e. when a potential null access is expected?

Again, it should not, but this is an implementation-specific aspect of JVM behavior.
That means the likelihood of such confusion happening in practice may vary depending on the JVM version, the specific code being executed, and the state of the JVM at the time of the signal.

See for instance "How does the JVM know when to throw a NullPointerException":

The JVM could implement the null check using virtual memory hardware. The JVM arranges that page zero in its virtual address space is mapped to a page that is unreadable + unwriteable.

Since null is represented as zero, when Java code tries to dereference null this will try to access a non-addressable page and will lead to the OS delivering a "segfault" signal to the JVM.

The JVM's segfault signal handler could trap this, figure out where the code was executing, and create and throw an NPE on the stack of the appropriate thread.

In that scenario, it should be easy to distinguish a fault raised by the code being executed from a signal sent to the process from outside.
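
The trap-and-recover idea quoted above can be tried in miniature. The sketch below assumes Linux and uses a freshly mapped `PROT_NONE` page as a stand-in for the unreadable page at address zero. The helper names (`map_inaccessible_page`, `guarded_read`, `trap_fault`) are hypothetical, and the recovery uses `sigsetjmp`/`siglongjmp` rather than HotSpot's stub mechanism.

```cpp
#include <cassert>
#include <setjmp.h>
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf g_resume;                     // where to continue after a trapped fault
static volatile sig_atomic_t g_fault_code = 0;  // si_code of the last trapped fault
static void* volatile g_fault_addr = nullptr;   // si_addr of the last trapped fault

// Hypothetical handler: remember the fault details, then unwind to the guarded
// read instead of returning into the faulting instruction.
static void trap_fault(int, siginfo_t* info, void*) {
    g_fault_code = info->si_code;
    g_fault_addr = info->si_addr;
    siglongjmp(g_resume, 1);
}

// Maps one page with no access rights; returns nullptr if mmap fails.
void* map_inaccessible_page() {
    long page = sysconf(_SC_PAGESIZE);
    void* p = mmap(nullptr, static_cast<size_t>(page), PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Tries to read one int from addr; returns false if the read faulted.
bool guarded_read(const volatile int* addr, int* out) {
    struct sigaction sa {};
    struct sigaction old {};
    sa.sa_sigaction = trap_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    int rc = sigaction(SIGSEGV, &sa, &old);
    assert(rc == 0);
    (void)rc;

    if (sigsetjmp(g_resume, 1) != 0) {
        // Reached via siglongjmp: the read below faulted.
        sigaction(SIGSEGV, &old, nullptr);
        return false;
    }
    *out = *addr;  // raises SIGSEGV if the page is protected
    sigaction(SIGSEGV, &old, nullptr);
    return true;
}
```

Calling `guarded_read` on the protected page returns `false` and leaves the faulting address in `g_fault_addr`; calling it on an ordinary variable returns `true`. A real JVM turns the trapped fault into a Java exception on the faulting thread, which this sketch does not attempt.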

Also: "Can a SIGSEGV in Java not crash the JVM?"

There are definitely scenarios where the JVM's SIGSEGV signal handler may turn the SIGSEGV event into a Java exception.
You will only get a JVM hard crash if that cannot happen; e.g. if the thread that triggered the SIGSEGV was executing code in a native library when the event happened.

For instance:

HotSpot JVM deliberately generates SIGSEGV at startup to check certain CPU features. There is no switch to turn it off. I suggest skipping SIGSEGV in gdb altogether, because JVM uses it for its own purpose in many cases.

What if the thread happens to be executing a memory-access instruction at the moment the SIGSEGV is triggered externally?

HotSpot had a major refactoring of its signal handling in JDK-8255711, resulting in commit dd8e4ff.

The current code is in os_linux_x86.cpp#PosixSignals::pd_hotspot_signal_handler:

  // decide if this trap can be handled by a stub
  address stub = nullptr;

  address pc          = nullptr;

  //%note os_trap_1
  if (info != nullptr && uc != nullptr && thread != nullptr) {
    pc = (address) os::Posix::ucontext_get_pc(uc);

    if (sig == SIGSEGV && info->si_addr == 0 && info->si_code == SI_KERNEL) {
      // An irrecoverable SI_KERNEL SIGSEGV has occurred.
      // It's likely caused by dereferencing an address larger than TASK_SIZE.
      return false;
    }

    // Handle ALL stack overflow variations here
    if (sig == SIGSEGV) {
      address addr = (address) info->si_addr;

      // check if fault address is within thread stack
      if (thread->is_in_full_stack(addr)) {
        // stack overflow
        if (os::Posix::handle_stack_overflow(thread, addr, pc, uc, &stub)) {
          return true; // continue
        }
      }
    }

    if ((sig == SIGSEGV) && VM_Version::is_cpuinfo_segv_addr(pc)) {
      // Verify that OS save/restore AVX registers.
      stub = VM_Version::cpuinfo_cont_addr();
    }

    if (thread->thread_state() == _thread_in_Java) {
      // Java thread running in Java code => find exception handler if any
      // a fault inside compiled code, the interpreter, or a stub

      if (sig == SIGSEGV && SafepointMechanism::is_poll_address((address)info->si_addr)) {
        stub = SharedRuntime::get_poll_stub(pc);
      } else if (sig == SIGBUS /* && info->si_code == BUS_OBJERR */) {
        // BugId 4454115: A read from a MappedByteBuffer can fault
        // here if the underlying file has been truncated.
        // Do not crash the VM in such a case.
        CodeBlob* cb = CodeCache::find_blob(pc);
        CompiledMethod* nm = (cb != nullptr) ? cb->as_compiled_method_or_null() : nullptr;
        bool is_unsafe_arraycopy = thread->doing_unsafe_access() && UnsafeCopyMemory::contains_pc(pc);
        if ((nm != nullptr && nm->has_unsafe_access()) || is_unsafe_arraycopy) {
          address next_pc = Assembler::locate_next_instruction(pc);
          if (is_unsafe_arraycopy) {
            next_pc = UnsafeCopyMemory::page_error_continue_pc(pc);
          }
          stub = SharedRuntime::handle_unsafe_access(thread, next_pc);
        }
      }
      else

#ifdef AMD64
      if (sig == SIGFPE  &&
          (info->si_code == FPE_INTDIV || info->si_code == FPE_FLTDIV)) {
        stub =
          SharedRuntime::
          continuation_for_implicit_exception(thread,
                                              pc,
                                              SharedRuntime::
                                              IMPLICIT_DIVIDE_BY_ZERO);
#else
      if (sig == SIGFPE /* && info->si_code == FPE_INTDIV */) {
        // HACK: si_code does not work on linux 2.2.12-20!!!
        int op = pc[0];
        if (op == 0xDB) {
          // FIST
          // TODO: The encoding of D2I in x86_32.ad can cause an exception
          // prior to the fist instruction if there was an invalid operation
          // pending. We want to dismiss that exception. From the win_32
          // side it also seems that if it really was the fist causing
          // the exception that we do the d2i by hand with different
          // rounding. Seems kind of weird.
          // NOTE: that we take the exception at the NEXT floating point instruction.
          assert(pc[0] == 0xDB, "not a FIST opcode");
          assert(pc[1] == 0x14, "not a FIST opcode");
          assert(pc[2] == 0x24, "not a FIST opcode");
          return true;
        } else if (op == 0xF7) {
          // IDIV
          stub = SharedRuntime::continuation_for_implicit_exception(thread, pc, SharedRuntime::IMPLICIT_DIVIDE_BY_ZERO);
        } else {
          // TODO: handle more cases if we are using other x86 instructions
          //   that can generate SIGFPE signal on linux.
          tty->print_cr("unknown opcode 0x%X with SIGFPE.", op);
          fatal("please update this code.");
        }
#endif // AMD64
      } else if (sig == SIGSEGV &&
                 MacroAssembler::uses_implicit_null_check(info->si_addr)) {
          // Determination of interpreter/vtable stub/compiled code null exception
          stub = SharedRuntime::continuation_for_implicit_exception(thread, pc, SharedRuntime::IMPLICIT_NULL);
      }
    } else if ((thread->thread_state() == _thread_in_vm ||
                thread->thread_state() == _thread_in_native) &&
               (sig == SIGBUS && /* info->si_code == BUS_OBJERR && */
               thread->doing_unsafe_access())) {
        address next_pc = Assembler::locate_next_instruction(pc);
        if (UnsafeCopyMemory::contains_pc(pc)) {
          next_pc = UnsafeCopyMemory::page_error_continue_pc(pc);
        }
        stub = SharedRuntime::handle_unsafe_access(thread, next_pc);
    }

    // jni_fast_Get<Primitive>Field can trap at certain pc's if a GC kicks in
    // and the heap gets shrunk before the field access.
    if ((sig == SIGSEGV) || (sig == SIGBUS)) {
      address addr = JNI_FastGetField::find_slowcase_pc(pc);
      if (addr != (address)-1) {
        stub = addr;
      }
    }
  }

The JVM uses various checks to determine the context of a SIGSEGV signal. However, I do not see a straightforward mechanism to distinguish an externally sent SIGSEGV from one internally generated due to a null reference access.

The signal handler examines the execution context, including the program counter and the stack, to infer the cause of the SIGSEGV. In case of a null reference, it looks for specific patterns that suggest a null pointer exception. But if an external SIGSEGV happens to coincide precisely with a situation where the JVM's execution state resembles that of a null pointer access, distinguishing between the two can be challenging.

However, such a scenario is relatively unlikely due to the level of precision required in timing.
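
To make the reasoning above concrete, here is a deliberately simplified model of the decision order visible in the excerpt. It is a sketch, not HotSpot's logic: the types `ThreadState`, `Outcome` and `FaultContext`, the function `classify_segv`, and the 4096-byte page size are all assumptions made for illustration, and the stack-overflow, CPU-feature-probe and JNI fast-path branches are left out. What it shows is that the outcome depends only on the thread state and the reported fault address, with no input describing who sent the signal.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, reduced view of the state the handler excerpt looks at.
enum class ThreadState { InJava, InVm, InNative };
enum class Outcome { ThrowNullPointerException, SafepointPoll, FatalError };

struct FaultContext {
    ThreadState state;
    std::uintptr_t fault_addr;  // what the handler reads as info->si_addr
    std::uintptr_t poll_page;   // start of the safepoint polling page
};

// Assumption: one 4 KiB page; the real value comes from the OS.
constexpr std::uintptr_t kAssumedPageSize = 4096;

// Mirrors the order of the SIGSEGV branches in the excerpt, in reduced form.
Outcome classify_segv(const FaultContext& ctx) {
    if (ctx.state != ThreadState::InJava) {
        return Outcome::FatalError;  // no stub found: unexpected signal
    }
    if (ctx.fault_addr >= ctx.poll_page &&
        ctx.fault_addr - ctx.poll_page < kAssumedPageSize) {
        return Outcome::SafepointPoll;  // expected fault on the polling page
    }
    if (ctx.fault_addr < kAssumedPageSize) {
        return Outcome::ThrowNullPointerException;  // looks like an implicit null check
    }
    return Outcome::FatalError;
}
```

In this model, a SIGSEGV is treated as a null access only when the thread is in Java code and the reported address falls within the first page, which is why the coincidence described above would need very specific conditions.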

Answered By - VonC

This answer was collected from Stack Overflow and tested by AndroidBugFix community admins. It is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
