PREEMPT_RT real-time kernel on Jetson Orin — build, latency, and thread setup

Aarón Angulo

Jetson Orin with PREEMPT_RT is capable of running 1kHz control loops in production. The PREEMPT_RT kernel makes nearly all kernel code preemptible, reducing worst-case interrupt latency from 5-30ms on standard L4T to 50-200µs on a properly configured system. The setup requires building a custom kernel (NVIDIA does not ship a pre-built RT kernel) and configuring CPU isolation for your real-time threads.

Key Insights

  • PREEMPT_RT requires building the L4T kernel yourself — NVIDIA does not ship a pre-built RT variant; apply the PREEMPT_RT patchset to the L4T 5.15 kernel source
  • CPU isolation is required for predictable latency — without isolcpus, the Linux scheduler will put housekeeping tasks on your RT core and spike latency
  • mlockall(MCL_CURRENT | MCL_FUTURE) is mandatory — without it, page faults in your RT thread can stall for milliseconds during memory access
  • Worst-case latency is what matters for control loops, not average — use cyclictest with --mlockall --smp --priority=99 to measure worst-case under realistic load
  • CUDA workloads on other cores can cause DMA-related IRQ latency spikes — if running CV inference alongside RT control, pin CUDA work to non-RT cores and IRQ-affinity the GPU interrupt away from the RT core

Applying PREEMPT_RT to L4T kernel

# In your machine .conf or local.conf:
LINUX_KERNEL_TYPE = "preempt-rt"

# Or add the RT SCC to your kernel features:
KERNEL_FEATURES:append = " features/preempt-rt/preempt-rt.scc"

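This is the Yocto route: it relies on the OE4T/meta-tegra layer’s built-in PREEMPT_RT support on the scarthgap (JetPack 6.x) branch. If you do not build with Yocto, patch the kernel by hand as described in the next section.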

Manual patch application

# Download L4T kernel sources
# From NVIDIA Jetson Linux page: Public_Sources.tbz2

tar xf Jetson_Linux_R36.x.0_aarch64.tbz2
source_sync.sh -k 5.15

# Find matching PREEMPT_RT patch
# kernel.org preempt-rt/patches/v5.15/ 
# Use the latest 5.15.x-rt patch matching your L4T minor version
wget https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.15/patch-5.15.71-rt51.patch.gz

# Apply (the patch was downloaded two directories above the kernel tree)
cd kernel/kernel-5.15
zcat ../../patch-5.15.71-rt51.patch.gz | patch -p1

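# Configure
make tegra_defconfig
scripts/config --enable PREEMPT_RT
scripts/config --disable PREEMPT_LAZY
scripts/config --enable CPU_FREQ_DEFAULT_GOV_PERFORMANCE

# Resolve dependencies of the changed options, then confirm PREEMPT_RT survived
make olddefconfig
grep CONFIG_PREEMPT_RT .config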

# Build
make -j$(nproc) Image dtbs modules

Verifying RT kernel is running

uname -r
# Should include "rt" in version string
# e.g., 5.15.71-tegra-rt51-v8+

# Check preemption model
grep PREEMPT_RT /boot/config-$(uname -r)
# CONFIG_PREEMPT_RT=y

# Verify no PREEMPT_LAZY (defeats RT preemptability)
grep PREEMPT_LAZY /boot/config-$(uname -r)
# Should be: # CONFIG_PREEMPT_LAZY is not set
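
A control program can repeat this check at startup and refuse to run on a non-RT kernel. The following is a minimal sketch, assuming the kernel build string (the `version` field of `uname(2)`, which is what `uname -v` prints) contains `PREEMPT_RT` on RT kernels, or `PREEMPT RT` on older patchsets. The helper names `version_string_is_rt` and `running_kernel_is_rt` are hypothetical, not part of any library.

```c
#include <assert.h>
#include <string.h>
#include <sys/utsname.h>

/* Returns 1 if a kernel build string (as printed by `uname -v`)
 * carries the PREEMPT_RT marker, 0 otherwise. */
int version_string_is_rt(const char *version) {
    assert(version != NULL);
    return strstr(version, "PREEMPT_RT") != NULL ||
           strstr(version, "PREEMPT RT") != NULL;
}

/* Returns 1 if the running kernel reports PREEMPT_RT, 0 if it does not,
 * and -1 if uname() itself fails. */
int running_kernel_is_rt(void) {
    struct utsname u;
    if (uname(&u) != 0)
        return -1;
    return version_string_is_rt(u.version);
}
```

Calling `running_kernel_is_rt()` at the top of `main()` and exiting when it does not return 1 avoids silently running a control loop on a stock kernel after a bad flash or rollback.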

CPU isolation setup

# Add to /boot/extlinux/extlinux.conf APPEND line:
# isolcpus=3 nohz_full=3 rcu_nocbs=3 irqaffinity=0-2

# After reboot, verify CPU 3 is isolated
cat /sys/devices/system/cpu/isolated
# 3

# Move all movable IRQs onto CPUs 0-2 (run as root).
# The mask is hexadecimal: 0b111 = 7. Some IRQs cannot be moved; those writes fail and are ignored.
for i in /proc/irq/*/smp_affinity; do
    echo 7 > "$i" 2>/dev/null || true
done
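
The mask `7` keeps IRQs on CPUs 0-2 only. Orin modules have between 6 and 12 cores, so a mask that excludes just the isolated CPU looks different on each module. The sketch below computes that mask; `irq_mask_excluding` and `format_irq_mask` are hypothetical helpers, and the 32-CPU limit is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdio.h>

/* Build an smp_affinity bitmask that allows CPUs 0..ncpus-1
 * except the isolated one. Supports up to 32 CPUs. */
unsigned int irq_mask_excluding(int ncpus, int isolated_cpu) {
    assert(ncpus >= 1 && ncpus <= 32);
    assert(isolated_cpu >= 0 && isolated_cpu < ncpus);
    unsigned int all = (ncpus == 32) ? 0xFFFFFFFFu : ((1u << ncpus) - 1u);
    return all & ~(1u << isolated_cpu);
}

/* Format the mask as the hexadecimal string that
 * /proc/irq/N/smp_affinity expects. Returns the string length. */
int format_irq_mask(char *buf, size_t len, int ncpus, int isolated_cpu) {
    return snprintf(buf, len, "%x", irq_mask_excluding(ncpus, isolated_cpu));
}
```

With 4 CPUs and CPU 3 isolated this gives `7`, matching the loop above. With 8 CPUs it gives `f7`, which leaves CPUs 4-7 available for interrupt handling as well.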

Writing an RT control thread

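#define _GNU_SOURCE           /* needed for cpu_set_t, CPU_SET, pthread_setaffinity_np */
#include <pthread.h>
#include <sched.h>
#include <sys/mman.h>
#include <time.h>
#include <stdio.h>
#include <string.h>

#define PERIOD_NS   1000000   /* 1ms = 1kHz control loop */
#define RT_PRIORITY 90        /* 1-99; higher = higher priority */
#define RT_CPU      3         /* isolated core from the isolcpus= setting */

/* Placeholder hooks: replace with your own sensor, control, and actuator code */
static void read_sensors(void)    { }
static void compute_pid(void)     { }
static void write_actuators(void) { }

void *rt_control_loop(void *arg) {
    struct timespec next;
    (void)arg;

    /* Pin to isolated CPU core */
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(RT_CPU, &cpuset);
    int err = pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset);
    if (err != 0)
        fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(err));

    clock_gettime(CLOCK_MONOTONIC, &next);

    while (1) {
        /* ── Your 1ms control work here ── */
        read_sensors();
        compute_pid();
        write_actuators();
        /* ── End control work ── */

        /* Advance deadline by one period */
        next.tv_nsec += PERIOD_NS;
        if (next.tv_nsec >= 1000000000LL) {
            next.tv_nsec -= 1000000000LL;
            next.tv_sec++;
        }

        /* Sleep until absolute time: avoids drift accumulation */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
    return NULL;
}

int main(void) {
    pthread_t rt_thread;
    pthread_attr_t attr;
    struct sched_param param = { .sched_priority = RT_PRIORITY };

    /* Lock all memory before the RT thread exists: prevents page faults in RT context */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return 1;
    }

    pthread_attr_init(&attr);

    /* Stack size: preallocate to prevent page faults */
    pthread_attr_setstacksize(&attr, 1024 * 1024);  /* 1MB */

    /* SCHED_FIFO preempts all non-RT threads. Without PTHREAD_EXPLICIT_SCHED
       the policy and priority set on attr are silently ignored. */
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    pthread_attr_setschedparam(&attr, &param);
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);

    int err = pthread_create(&rt_thread, &attr, rt_control_loop, NULL);
    if (err != 0) {
        /* EPERM usually means the process lacks root or CAP_SYS_NICE */
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return 1;
    }
    pthread_join(rt_thread, NULL);
    return 0;
}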

Compile with:

gcc -O2 -o rt_control rt_control.c -lpthread -lrt
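
The deadline arithmetic in the loop handles a single nanosecond wrap and assumes every iteration finishes inside its period. The sketch below shows one way to make both explicit: a normalizing add, a signed difference, and a count of deadlines already missed. All three function names are hypothetical helpers, not part of libc.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Advance an absolute deadline by ns nanoseconds (ns >= 0),
 * normalizing tv_nsec into the range [0, 1e9). */
void timespec_add_ns(struct timespec *t, int64_t ns) {
    assert(t != NULL && ns >= 0);
    int64_t total = (int64_t)t->tv_nsec + ns;
    t->tv_sec  += (time_t)(total / NSEC_PER_SEC);
    t->tv_nsec  = (long)(total % NSEC_PER_SEC);
}

/* Signed difference a - b in nanoseconds. */
int64_t timespec_diff_ns(const struct timespec *a, const struct timespec *b) {
    return (int64_t)(a->tv_sec - b->tv_sec) * NSEC_PER_SEC
         + (int64_t)(a->tv_nsec - b->tv_nsec);
}

/* Number of deadlines (deadline, deadline + period, ...) that lie
 * strictly before `now`. 0 means the loop is still on time. */
int64_t missed_deadlines(const struct timespec *now,
                         const struct timespec *deadline,
                         int64_t period_ns) {
    assert(period_ns > 0);
    int64_t late = timespec_diff_ns(now, deadline);
    return late > 0 ? (late - 1) / period_ns + 1 : 0;
}
```

In the control loop, reading `CLOCK_MONOTONIC` after the control work and passing it with `next` to `missed_deadlines` reveals overruns that `clock_nanosleep` would otherwise hide by returning immediately. Whether to skip the missed periods or fault the controller is an application decision.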

Measuring latency with cyclictest

# Install rt-tests
sudo apt install rt-tests

# Run cyclictest: 1000Hz on isolated CPU 3, 60 second test
#   --interval=1000   1ms interval (value in µs)
#   --affinity=3      isolated CPU
#   --histogram=200   build latency histogram up to 200µs
sudo cyclictest \
  --mlockall \
  --smp \
  --priority=99 \
  --interval=1000 \
  --distance=0 \
  --affinity=3 \
  --duration=60 \
  --histogram=200

# Good result on Jetson Orin with PREEMPT_RT:
# T: 0 (  3) P:99 I:1000 C:  60000 Min:    18 Act:   24 Avg:   21 Max:    87
# Worst-case: 87µs at 60000 samples
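
cyclictest measures the wake-up latency of its own threads. The same quantity can be recorded inside the control loop by comparing the time the thread actually woke with the deadline it asked for. This is a rough sketch of such bookkeeping; the `latency_stats` struct and its functions are hypothetical names, and late wake-ups are the only thing it tracks.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

struct latency_stats {
    int64_t  min_ns;
    int64_t  max_ns;
    int64_t  sum_ns;
    uint64_t count;
};

void latency_init(struct latency_stats *s) {
    assert(s != NULL);
    s->min_ns = INT64_MAX;
    s->max_ns = 0;
    s->sum_ns = 0;
    s->count  = 0;
}

/* Record one sample: how long after `deadline` the thread woke up. */
void latency_record(struct latency_stats *s,
                    const struct timespec *deadline,
                    const struct timespec *woke) {
    int64_t ns = (int64_t)(woke->tv_sec - deadline->tv_sec) * 1000000000LL
               + (int64_t)(woke->tv_nsec - deadline->tv_nsec);
    if (ns < 0)
        ns = 0;                      /* an early wake-up counts as zero latency */
    if (ns < s->min_ns) s->min_ns = ns;
    if (ns > s->max_ns) s->max_ns = ns;
    s->sum_ns += ns;
    s->count++;
}

/* Mean latency in nanoseconds, or 0 if nothing was recorded. */
int64_t latency_avg_ns(const struct latency_stats *s) {
    return s->count ? s->sum_ns / (int64_t)s->count : 0;
}
```

To use it, call `clock_gettime(CLOCK_MONOTONIC, &woke)` immediately after `clock_nanosleep` returns and pass `next` as the deadline. Keep `printf` out of the loop and report the statistics at shutdown, since console output can itself add latency.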

Latency comparison: standard vs PREEMPT_RT

Kernel               | Idle worst-case | Under load worst-case | Suitable for
Standard L4T 5.15    | 2-5ms           | 15-30ms               | Applications without hard timing
PREEMPT_RT L4T 5.15  | 50-120µs        | 150-500µs             | Control loops ≥500Hz
Jailhouse RTOS core  | 10-30µs         | 10-30µs               | Hard real-time <100µs
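
One way to read these numbers against a loop rate: the worst-case wake-up latency is taken out of every period, and what remains is the time available for the control work itself. A small sketch of that arithmetic, using a hypothetical helper `work_budget_us`:

```c
#include <assert.h>

/* Time left for control work in each period, in microseconds, after
 * subtracting the worst-case wake-up latency. A negative result means
 * the loop rate cannot be met even with zero work. */
long work_budget_us(long rate_hz, long worst_latency_us) {
    assert(rate_hz > 0 && worst_latency_us >= 0);
    long period_us = 1000000L / rate_hz;
    return period_us - worst_latency_us;
}
```

With the under-load figures from the table, a 1kHz loop on PREEMPT_RT has `work_budget_us(1000, 500)`, which is 500µs, left for sensor reads, control math, and actuator writes. On the standard kernel, `work_budget_us(1000, 30000)` is negative, so the period itself can be missed.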

If PREEMPT_RT latency is not sufficient and you need the Jailhouse hypervisor, see "Yocto BSP setup for Jetson Orin with meta-tegra", which covers the Yocto build path used to enable both features.

FAQ

Does Jetson Orin support PREEMPT_RT?

Yes. Apply the PREEMPT_RT patchset to the L4T 5.15 kernel source, or use LINUX_KERNEL_TYPE = "preempt-rt" in meta-tegra. NVIDIA does not ship a pre-built RT kernel.

What is the typical worst-case latency with PREEMPT_RT on Jetson Orin?

50–120µs on a quiet system with an isolated core. Under load (USB, PCIe, CUDA activity), 150–500µs. Use cyclictest to measure your specific workload.

How is PREEMPT_RT different from standard Linux for real-time tasks?

PREEMPT_RT replaces most kernel spinlocks with preemptible, priority-inheriting locks and runs interrupt handlers in schedulable kernel threads, so RT threads can preempt nearly anywhere in kernel code. Worst-case latency drops from 5–30ms to 50–300µs.

Is PREEMPT_RT on Jetson Orin suitable for hard real-time control?

PREEMPT_RT is soft real-time — excellent typical latency and much better worst-case than standard Linux, but no hard guarantees. For hard real-time (<100µs guaranteed), use Jailhouse with a dedicated bare-metal RTOS core.


Written by

Aarón Angulo

Co-Founder & CEO · ProventusNova