PREEMPT_RT real-time kernel on Jetson Orin — build, latency, and thread setup

Q: Does Jetson Orin support PREEMPT_RT?

Yes. PREEMPT_RT patches are maintained for the Jetson Orin L4T kernel (5.15 for JetPack 6.x). NVIDIA has not released a pre-built RT kernel, so you must apply the PREEMPT_RT patchset to the L4T kernel source and build it yourself, or use the meta-tegra Yocto layer with LINUX_KERNEL_TYPE = "preempt-rt". With PREEMPT_RT, worst-case latency on Orin drops from several milliseconds to 50-200µs, depending on interrupt load.

Q: What is the typical worst-case latency with PREEMPT_RT on Jetson Orin?

On a quiet system with an isolated CPU core, cyclictest reports worst-case latency of 50-120µs on Jetson Orin. Under load (USB, PCIe activity, CUDA workloads), worst-case can spike to 300-500µs if interrupts from those subsystems are not properly isolated or affinity-pinned away from the RT core. Real-world control loops at 1kHz (1ms period) run reliably with 200µs headroom.

Q: How is PREEMPT_RT different from standard Linux for real-time tasks?

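The standard Linux kernel has non-preemptible sections (hard interrupt handlers, spinlock-protected critical sections, and certain other kernel paths) in which even a high-priority RT task cannot preempt the running code. PREEMPT_RT makes nearly all of these preemptible: most spinlocks become sleeping locks built on rt_mutex with priority inheritance, and most interrupt handlers run in schedulable kernel threads. RT threads can therefore preempt almost anywhere in kernel code. This reduces worst-case latency from 5-30ms on standard kernels to 50-300µs on PREEMPT_RT.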

Q: Is PREEMPT_RT on Jetson Orin suitable for hard real-time control?

PREEMPT_RT is soft real-time — it provides very low typical latency and much better worst-case latency than standard Linux, but it does not provide hard real-time guarantees. For hard real-time (guaranteed sub-100µs response regardless of system load), consider the Jailhouse hypervisor approach that runs a bare-metal RTOS on a dedicated CPU core alongside Linux.

Jetson Orin with PREEMPT_RT is capable of running 1kHz control loops in production. The PREEMPT_RT kernel makes nearly all kernel code preemptible, reducing worst-case interrupt latency from 5-30ms on standard L4T to 50-200µs on a properly configured system. The setup requires building a custom kernel (NVIDIA does not ship a pre-built RT kernel) and configuring CPU isolation for your real-time threads.

Key Insights

PREEMPT_RT requires building the L4T kernel yourself — NVIDIA does not ship a pre-built RT variant; apply the PREEMPT_RT patchset to the L4T 5.15 kernel source
CPU isolation is required for predictable latency — without isolcpus, the Linux scheduler will put housekeeping tasks on your RT core and spike latency
mlockall(MCL_CURRENT | MCL_FUTURE) is mandatory — without it, page faults in your RT thread can stall for milliseconds during memory access
Worst-case latency is what matters for control loops, not average — use cyclictest with --mlockall --smp --priority=99 to measure worst-case under realistic load
CUDA workloads on other cores can cause DMA-related IRQ latency spikes — if running CV inference alongside RT control, pin CUDA work to non-RT cores and IRQ-affinity the GPU interrupt away from the RT core
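
To make the worst-case versus average distinction concrete, here is a minimal sketch of the bookkeeping a periodic thread could do itself: compare the planned wake-up time with the time it actually woke, and track min, average, and max. The type and function names (lat_stats, lat_stats_record, and so on) are illustrative and not part of any library; cyclictest remains the reference tool.

```c
#include <stdint.h>
#include <time.h>

/* Running min/avg/max of wake-up latency in nanoseconds (illustrative type). */
struct lat_stats {
    int64_t  min_ns;
    int64_t  max_ns;
    int64_t  sum_ns;
    uint64_t count;
};

/* Signed difference a - b in nanoseconds. */
int64_t timespec_diff_ns(const struct timespec *a, const struct timespec *b)
{
    return (int64_t)(a->tv_sec - b->tv_sec) * 1000000000LL
         + (int64_t)(a->tv_nsec - b->tv_nsec);
}

void lat_stats_init(struct lat_stats *s)
{
    s->min_ns = INT64_MAX;
    s->max_ns = INT64_MIN;
    s->sum_ns = 0;
    s->count  = 0;
}

/* Record one sample: how late the thread woke relative to the planned time. */
void lat_stats_record(struct lat_stats *s,
                      const struct timespec *planned,
                      const struct timespec *actual)
{
    int64_t lat = timespec_diff_ns(actual, planned);

    if (lat < s->min_ns) s->min_ns = lat;
    if (lat > s->max_ns) s->max_ns = lat;
    s->sum_ns += lat;
    s->count++;
}

/* Average latency; returns 0 when no samples were recorded. */
int64_t lat_stats_avg_ns(const struct lat_stats *s)
{
    return s->count ? s->sum_ns / (int64_t)s->count : 0;
}
```

In a control loop, the planned time is the absolute deadline passed to clock_nanosleep and the actual time is a clock_gettime(CLOCK_MONOTONIC, ...) reading taken immediately after it returns. The figure to budget against is max_ns, not the average.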

Applying PREEMPT_RT to L4T kernel

Via Yocto (meta-tegra, recommended for production)

# In your machine .conf or local.conf:
LINUX_KERNEL_TYPE = "preempt-rt"

# Or add the RT SCC to your kernel features:
KERNEL_FEATURES:append = " features/preempt-rt/preempt-rt.scc"

This uses the OE4T/meta-tegra layer’s built-in PREEMPT_RT support for the scarthgap (JetPack 6.x) branch.

Manual patch application

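# Download the L4T Driver Package (BSP) from the NVIDIA Jetson Linux page.
# The kernel sources are also published there as public_sources.tbz2.

tar xf Jetson_Linux_R36.x.0_aarch64.tbz2
cd Linux_for_Tegra/source
# Sync the kernel sources; pass the release tag that matches your L4T version
./source_sync.sh -k -t <release-tag>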

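# Find the matching PREEMPT_RT patch under
# https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.15/
# Use the 5.15.x-rt patch whose base version matches your L4T kernel
# (check VERSION/PATCHLEVEL/SUBLEVEL in the kernel's top-level Makefile);
# 5.15.71-rt51 below is only an example
wget https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.15/patch-5.15.71-rt51.patch.gz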

# Apply (the patch was downloaded two directories above the kernel tree)
cd kernel/kernel-5.15
zcat ../../patch-5.15.71-rt51.patch.gz | patch -p1

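# Configure
make tegra_defconfig
scripts/config --enable PREEMPT_RT
scripts/config --disable PREEMPT_LAZY
scripts/config --enable CPU_FREQ_DEFAULT_GOV_PERFORMANCE
# scripts/config edits .config without resolving dependencies;
# let Kconfig settle them, then confirm the option survived
make olddefconfig
grep PREEMPT_RT .config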

# Build
make -j$(nproc) Image dtbs modules

Verifying RT kernel is running

uname -r
# Should include "rt" in version string
# e.g., 5.15.71-tegra-rt51-v8+

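# Check preemption model
grep PREEMPT_RT /boot/config-$(uname -r)
# CONFIG_PREEMPT_RT=y
# If /boot/config-* is absent (common with a hand-built kernel), try instead:
#   zcat /proc/config.gz | grep PREEMPT_RT   (requires CONFIG_IKCONFIG_PROC)
#   cat /sys/kernel/realtime                 (prints 1 on a PREEMPT_RT kernel)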

# Verify no PREEMPT_LAZY (defeats RT preemptability)
grep PREEMPT_LAZY /boot/config-$(uname -r)
# Should be: # CONFIG_PREEMPT_LAZY is not set
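
An application can make the same check at startup and refuse to run its control loop on a non-RT kernel. The sketch below relies on /sys/kernel/realtime, which PREEMPT_RT kernels expose with the value 1. The helper names rt_flag_set and running_on_rt_kernel are illustrative, and the path parameter exists only so the logic can be exercised against an ordinary file.

```c
#include <stdio.h>

/* Returns 1 if `path` can be opened and its first character is '1', else 0. */
int rt_flag_set(const char *path)
{
    FILE *f = fopen(path, "r");
    int c;

    if (f == NULL)
        return 0;          /* file missing: not an RT kernel (or no sysfs) */
    c = fgetc(f);
    fclose(f);
    return c == '1';
}

/* Returns 1 when the running kernel reports itself as PREEMPT_RT. */
int running_on_rt_kernel(void)
{
    return rt_flag_set("/sys/kernel/realtime");
}
```

A typical use is to call running_on_rt_kernel() at the top of main() and print a warning or exit if it returns 0.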

CPU isolation setup

# Add to /boot/extlinux/extlinux.conf APPEND line:
# isolcpus=3 nohz_full=3 rcu_nocbs=3 irqaffinity=0-2

# After reboot, verify CPU 3 is isolated
cat /sys/devices/system/cpu/isolated
# 3

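# Move all movable IRQs away from CPU 3 (run as root; some IRQs reject the write, so errors are ignored)
for i in /proc/irq/*/smp_affinity; do
    echo 7 > "$i" 2>/dev/null  # hex mask 0x7 = CPUs 0,1,2 only; on parts with more cores, widen the mask but keep bit 3 clear
done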
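
The value 7 above covers only CPUs 0-2, while Orin modules have more than four cores. As a rough sketch, the hex mask for "every CPU except the isolated one" can be computed as below. The function names are illustrative, and the code assumes at most 64 CPUs and the single-word hex format that smp_affinity accepts on systems of that size.

```c
#include <stdint.h>
#include <stdio.h>

/* Bitmask with one bit per CPU in [0, ncpus), with the isolated CPU cleared.
 * Assumes ncpus <= 64. */
uint64_t irq_mask_excluding(unsigned ncpus, unsigned isolated_cpu)
{
    uint64_t all = (ncpus >= 64) ? UINT64_MAX
                                 : ((UINT64_C(1) << ncpus) - 1);

    if (isolated_cpu < 64)
        all &= ~(UINT64_C(1) << isolated_cpu);
    return all;
}

/* Format the mask as lowercase hex, the form written to smp_affinity.
 * Returns the number of characters snprintf would have written. */
int format_irq_mask(char *buf, size_t len, uint64_t mask)
{
    return snprintf(buf, len, "%llx", (unsigned long long)mask);
}
```

For an 8-core module with CPU 3 isolated, this gives f7; for a 12-core module, ff7. The same value can replace the 7 in the shell loop.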

Writing an RT control thread

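#define _GNU_SOURCE           /* for cpu_set_t, CPU_SET, pthread_setaffinity_np */
#include <pthread.h>
#include <sched.h>
#include <sys/mman.h>
#include <time.h>
#include <stdio.h>

#define PERIOD_NS   1000000   /* 1ms = 1kHz control loop */
#define RT_PRIORITY 90        /* 1-99; higher = higher priority */

/* Application-specific hooks; supply your own implementations */
void read_sensors(void);
void compute_pid(void);
void write_actuators(void);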

void *rt_control_loop(void *arg) {
    struct sched_param param = { .sched_priority = RT_PRIORITY };
    struct timespec next, now;

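    /* Lock all memory: prevents page faults in RT context */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        perror("mlockall");

    /* Set SCHED_FIFO: preempts all non-RT threads (needs root or CAP_SYS_NICE) */
    if (pthread_setschedparam(pthread_self(), SCHED_FIFO, &param) != 0)
        fprintf(stderr, "pthread_setschedparam failed; running without RT priority\n");

    /* Pin to isolated CPU core */
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(3, &cpuset);
    if (pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset) != 0)
        fprintf(stderr, "pthread_setaffinity_np failed; thread is not pinned to CPU 3\n");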

    clock_gettime(CLOCK_MONOTONIC, &next);

    while (1) {
        /* ── Your 1ms control work here ── */
        read_sensors();
        compute_pid();
        write_actuators();
        /* ── End control work ── */

        /* Advance deadline by one period */
        next.tv_nsec += PERIOD_NS;
        if (next.tv_nsec >= 1000000000LL) {
            next.tv_nsec -= 1000000000LL;
            next.tv_sec++;
        }

        /* Sleep until absolute time — avoids drift accumulation */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
    return NULL;
}

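int main(void) {
    pthread_t rt_thread;
    pthread_attr_t attr;
    struct sched_param param = { .sched_priority = RT_PRIORITY };
    int err;

    pthread_attr_init(&attr);

    /* Use the policy and priority from attr; by default a new thread inherits
       the creator's scheduling and the two settings below are ignored */
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    pthread_attr_setschedparam(&attr, &param);

    /* Stack size: fixed up front; mlockall() in the thread pins it in RAM */
    pthread_attr_setstacksize(&attr, 1024 * 1024);  /* 1MB */

    err = pthread_create(&rt_thread, &attr, rt_control_loop, NULL);
    if (err != 0) {
        fprintf(stderr, "pthread_create failed (error %d); SCHED_FIFO needs root or CAP_SYS_NICE\n", err);
        return 1;
    }
    pthread_join(rt_thread, NULL);
    return 0;
}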

Compile with:

gcc -O2 -o rt_control rt_control.c -lpthread -lrt
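
The loop above advances its deadline unconditionally, so a cycle that overruns its 1ms budget goes unnoticed and the following sleeps return immediately until the schedule catches up. One possible way to detect and handle overruns is sketched below. The helper names are illustrative, and skipping whole periods is only one policy (an application might prefer to log the overrun and stop).

```c
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance *t by ns nanoseconds (ns >= 0), keeping tv_nsec in [0, 1e9). */
void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_sec  += ns / NSEC_PER_SEC;
    t->tv_nsec += ns % NSEC_PER_SEC;
    if (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}

/* Returns 1 if `now` is strictly later than `deadline`, else 0. */
int deadline_missed(const struct timespec *now, const struct timespec *deadline)
{
    if (now->tv_sec != deadline->tv_sec)
        return now->tv_sec > deadline->tv_sec;
    return now->tv_nsec > deadline->tv_nsec;
}

/* Move `deadline` forward by whole periods until it is no longer in the past.
 * Returns the number of periods that were skipped. */
unsigned skip_missed_periods(struct timespec *deadline,
                             const struct timespec *now,
                             long period_ns)
{
    unsigned skipped = 0;

    while (deadline_missed(now, deadline)) {
        timespec_add_ns(deadline, period_ns);
        skipped++;
    }
    return skipped;
}
```

In the control loop, this would be called after advancing the deadline and before clock_nanosleep, using a fresh clock_gettime(CLOCK_MONOTONIC, ...) reading for now. A non-zero return value is an overrun count worth logging from a non-RT thread.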

Measuring latency with cyclictest

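# Install rt-tests
sudo apt install rt-tests

# Run cyclictest: 1000Hz on isolated CPU 3, 60 second test
# --threads=1 with --affinity=3 runs a single measurement thread on the isolated CPU
#   (--smp would instead start one thread per CPU)
# --interval is in µs, so 1000 = 1ms
# --histogram=200 builds a latency histogram up to 200µs
# Comments cannot follow a trailing backslash, so they are kept up here
sudo cyclictest \
  --mlockall \
  --threads=1 \
  --priority=99 \
  --interval=1000 \
  --distance=0 \
  --affinity=3 \
  --duration=60 \
  --histogram=200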

# Good result on Jetson Orin with PREEMPT_RT:
# T: 0 (  3) P:99 I:1000 C:  60000 Min:    18 Act:   24 Avg:   21 Max:    87
# Worst-case: 87µs at 60000 samples
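
For automated checks (for example a CI gate on a test board), the per-thread summary line can be parsed and compared against the loop's timing budget. The sketch below assumes the line layout shown in the sample output above, without the leading "# ". The struct, the function names, and the 700µs compute time used in the usage note are hypothetical.

```c
#include <stdio.h>

/* Fields of one cyclictest per-thread summary line (illustrative struct). */
struct cyclictest_row {
    int           thread;       /* T: thread index */
    int           tid;          /* value in parentheses */
    int           prio;         /* P: priority */
    long          interval_us;  /* I: interval */
    unsigned long cycles;       /* C: loop count */
    long          min_us, act_us, avg_us, max_us;
};

/* Parse one summary line. Returns 0 on success, -1 if the line does not match. */
int parse_cyclictest_row(const char *line, struct cyclictest_row *r)
{
    int n = sscanf(line,
                   " T:%d (%d) P:%d I:%ld C:%lu Min:%ld Act:%ld Avg:%ld Max:%ld",
                   &r->thread, &r->tid, &r->prio, &r->interval_us, &r->cycles,
                   &r->min_us, &r->act_us, &r->avg_us, &r->max_us);

    return n == 9 ? 0 : -1;
}

/* Time left in one period after worst-case wake-up latency and compute time. */
long headroom_us(long period_us, long max_latency_us, long work_us)
{
    return period_us - max_latency_us - work_us;
}
```

With the sample line above (Max: 87) and an assumed 700µs of control work per cycle, headroom_us(1000, 87, 700) leaves 213µs in a 1ms period. A negative result means the loop cannot meet its deadline in the worst case.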

Latency comparison: standard vs PREEMPT_RT

Kernel	Idle worst-case	Under load worst-case	Suitable for
Standard L4T 5.15	2-5ms	15-30ms	Applications without hard timing
PREEMPT_RT L4T 5.15	50-120µs	150-500µs	Control loops ≥500Hz
Jailhouse RTOS core	10-30µs	10-30µs	Hard real-time <100µs

When PREEMPT_RT latency is not sufficient, the Jailhouse hypervisor is the next option on Jetson. See "Yocto BSP setup for Jetson Orin with meta-tegra" for the Yocto build path used to enable both features.

FAQ

Does Jetson Orin support PREEMPT_RT?

Yes. Apply the PREEMPT_RT patchset to the L4T 5.15 kernel source, or use LINUX_KERNEL_TYPE = "preempt-rt" in meta-tegra. NVIDIA does not ship a pre-built RT kernel.

What is the typical worst-case latency with PREEMPT_RT on Jetson Orin?

50–120µs on a quiet system with an isolated core. Under load (USB, PCIe, CUDA activity), 150–500µs. Use cyclictest to measure your specific workload.

How is PREEMPT_RT different from standard Linux for real-time tasks?

PREEMPT_RT converts most kernel spinlocks to sleeping locks with priority inheritance and runs most interrupt handlers in schedulable kernel threads, allowing RT threads to preempt nearly anywhere in kernel code. Worst-case latency drops from 5–30ms to 50–300µs.

Is PREEMPT_RT on Jetson Orin suitable for hard real-time control?

PREEMPT_RT is soft real-time — excellent typical latency and much better worst-case than standard Linux, but no hard guarantees. For hard real-time (<100µs guaranteed), use Jailhouse with a dedicated bare-metal RTOS core.