PREEMPT_RT real-time kernel on Jetson Orin — build, latency, and thread setup
Jetson Orin with PREEMPT_RT is capable of running 1kHz control loops in production. The PREEMPT_RT kernel makes nearly all kernel code preemptible, reducing worst-case interrupt latency from 5-30ms on standard L4T to 50-200µs on a properly configured system. The setup requires building a custom kernel (NVIDIA does not ship a pre-built RT kernel) and configuring CPU isolation for your real-time threads.
Key Insights
- PREEMPT_RT requires building the L4T kernel yourself: NVIDIA does not ship a pre-built RT variant; apply the PREEMPT_RT patchset to the L4T 5.15 kernel source
- CPU isolation is required for predictable latency: without isolcpus, the Linux scheduler will put housekeeping tasks on your RT core and spike latency
- mlockall(MCL_CURRENT | MCL_FUTURE) is mandatory: without it, page faults in your RT thread can stall for milliseconds during memory access
- Worst-case latency is what matters for control loops, not average: use cyclictest with --mlockall --smp --priority=99 to measure worst-case under realistic load
- CUDA workloads on other cores can cause DMA-related IRQ latency spikes: if running CV inference alongside RT control, pin CUDA work to non-RT cores and IRQ-affinity the GPU interrupt away from the RT core
Applying PREEMPT_RT to L4T kernel
Via Yocto (meta-tegra, recommended for production)
# In your machine .conf or local.conf:
LINUX_KERNEL_TYPE = "preempt-rt"
# Or add the RT SCC to your kernel features:
KERNEL_FEATURES:append = " features/preempt-rt/preempt-rt.scc"
This uses the OE4T/meta-tegra layer’s built-in PREEMPT_RT support for the scarthgap (JetPack 6.x) branch.
Manual patch application
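# Download L4T kernel sources
# Either fetch Public_Sources.tbz2 from the NVIDIA Jetson Linux page,
# or extract the BSP and sync the kernel sources with source_sync.sh:
tar xf Jetson_Linux_R36.x.0_aarch64.tbz2
cd Linux_for_Tegra/source
./source_sync.sh -k -t <release-tag> # git tag matching your L4T release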
# Find matching PREEMPT_RT patch
# kernel.org preempt-rt/patches/v5.15/
# Use the latest 5.15.x-rt patch matching your L4T minor version
wget https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.15/patch-5.15.71-rt51.patch.gz
# Apply
cd kernel/kernel-5.15
# The patch was downloaded two directories up, before the cd
zcat ../../patch-5.15.71-rt51.patch.gz | patch -p1
# Configure
make tegra_defconfig
scripts/config --enable PREEMPT_RT
scripts/config --disable PREEMPT_LAZY
scripts/config --enable CPU_FREQ_DEFAULT_GOV_PERFORMANCE
# Build
make -j$(nproc) Image dtbs modules
Verifying RT kernel is running
uname -r
# Should include "rt" in version string
# e.g., 5.15.71-tegra-rt51-v8+
# Check preemption model
grep PREEMPT_RT /boot/config-$(uname -r)
# CONFIG_PREEMPT_RT=y
# Verify no PREEMPT_LAZY (defeats RT preemptability)
grep PREEMPT_LAZY /boot/config-$(uname -r)
# Should be: # CONFIG_PREEMPT_LAZY is not set
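The uname check above can also be done from inside an application at startup, so a control process can refuse to run on a non-RT kernel. The following is a minimal sketch, assuming the release string carries an "-rt<N>" tag as in the example above. The function names release_has_rt_tag and running_kernel_is_rt are hypothetical, and the string match is only a heuristic: a kernel built with a custom local version string may not follow this pattern.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>
#include <sys/utsname.h>

/* Returns 1 if a kernel release string contains an "-rt<digit>" tag,
 * e.g. "5.15.71-tegra-rt51-v8+"; returns 0 otherwise.
 * Heuristic only: it assumes the naming shown in the uname example. */
int release_has_rt_tag(const char *release)
{
    assert(release != NULL);
    const char *p = release;
    while ((p = strstr(p, "-rt")) != NULL) {
        if (isdigit((unsigned char)p[3]))
            return 1;
        p += 3; /* skip this match and keep scanning */
    }
    return 0;
}

/* Returns 1 if the running kernel's release string carries an RT tag,
 * 0 if it does not, and -1 if uname() itself fails. */
int running_kernel_is_rt(void)
{
    struct utsname u;
    if (uname(&u) != 0)
        return -1;
    return release_has_rt_tag(u.release);
}
```

A control application could call running_kernel_is_rt() before creating its RT thread and exit with a clear message when the result is not 1.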
CPU isolation setup
# Add to /boot/extlinux/extlinux.conf APPEND line:
# isolcpus=3 nohz_full=3 rcu_nocbs=3 irqaffinity=0-2
# After reboot, verify CPU 3 is isolated
cat /sys/devices/system/cpu/isolated
# 3
# Move all non-RT IRQs away from CPU 3 (run as root)
for i in /proc/irq/*/smp_affinity; do
    # Some IRQs reject affinity changes; those write errors are discarded
    echo 7 > "$i" 2>/dev/null # CPUs 0,1,2 only (bitmask 0b111 = 7)
done
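The two checks in this section, reading the isolated CPU list and building the IRQ affinity bitmask, can be sketched as small C helpers. This is an illustration under assumptions, not part of the original setup: cpu_in_list and irq_mask_excluding are hypothetical names, and the list parser assumes the comma-and-range format ("3", "2-3,5") that the kernel uses for CPU lists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Returns 1 if `cpu` appears in a CPU list string such as "3" or "2-3,5"
 * (the format read from /sys/devices/system/cpu/isolated), else 0. */
int cpu_in_list(const char *list, int cpu)
{
    assert(list != NULL);
    const char *p = list;
    while (*p != '\0' && *p != '\n') {
        char *end;
        long lo = strtol(p, &end, 10);
        if (end == p)
            return 0; /* malformed entry */
        long hi = lo;
        if (*end == '-') { /* range "lo-hi" */
            p = end + 1;
            hi = strtol(p, &end, 10);
            if (end == p)
                return 0;
        }
        if (cpu >= lo && cpu <= hi)
            return 1;
        p = (*end == ',') ? end + 1 : end;
    }
    return 0;
}

/* Builds the bitmask for /proc/irq/<n>/smp_affinity: every CPU in
 * [0, ncpus) except `isolated`. For 4 CPUs with CPU 3 isolated this
 * is 0x7, the value used in the shell loop above. */
unsigned long irq_mask_excluding(int ncpus, int isolated)
{
    assert(ncpus > 0 && ncpus <= (int)(8 * sizeof(unsigned long)));
    unsigned long mask = 0;
    for (int cpu = 0; cpu < ncpus; cpu++)
        if (cpu != isolated)
            mask |= 1UL << cpu;
    return mask;
}
```

The mask 7 limits IRQs to CPUs 0-2. On a module with more cores, computing the mask from the actual core count (for example 0xf7 for eight cores with CPU 3 isolated) lets the remaining housekeeping cores share interrupt load.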
Writing an RT control thread
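#define _GNU_SOURCE /* exposes cpu_set_t, CPU_SET, pthread_setaffinity_np */
#include <pthread.h>
#include <sched.h>
#include <sys/mman.h>
#include <time.h>
#include <stdio.h>
#include <string.h>
#define PERIOD_NS 1000000 /* 1ms = 1kHz control loop */
#define RT_PRIORITY 90 /* 1-99; higher = higher priority */
/* Application-specific hooks: supply your own implementations */
void read_sensors(void);
void compute_pid(void);
void write_actuators(void);
void *rt_control_loop(void *arg) {
    (void)arg;
    struct sched_param param = { .sched_priority = RT_PRIORITY };
    struct timespec next;
    int err;
    /* Lock all memory: prevents page faults in RT context */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        perror("mlockall");
    /* Set SCHED_FIFO: preempts all non-RT threads */
    err = pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
    if (err != 0)
        fprintf(stderr, "pthread_setschedparam: %s\n", strerror(err));
    /* Pin to isolated CPU core */
    cpu_set_t cpuset;
    CPU_ZERO(&cpuset);
    CPU_SET(3, &cpuset);
    pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset);
    clock_gettime(CLOCK_MONOTONIC, &next);
    while (1) {
        /* Your 1ms control work here */
        read_sensors();
        compute_pid();
        write_actuators();
        /* End control work */
        /* Advance deadline by one period */
        next.tv_nsec += PERIOD_NS;
        if (next.tv_nsec >= 1000000000LL) {
            next.tv_nsec -= 1000000000LL;
            next.tv_sec++;
        }
        /* Sleep until absolute time: avoids drift accumulation */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
    return NULL;
}
int main(void) {
    pthread_t rt_thread;
    pthread_attr_t attr;
    struct sched_param param = { .sched_priority = RT_PRIORITY };
    int err;
    pthread_attr_init(&attr);
    /* Without PTHREAD_EXPLICIT_SCHED the policy set on attr is ignored */
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    pthread_attr_setschedparam(&attr, &param);
    /* Stack size: fixed 1MB stack, locked into RAM by mlockall() in the thread */
    pthread_attr_setstacksize(&attr, 1024 * 1024); /* 1MB */
    err = pthread_create(&rt_thread, &attr, rt_control_loop, NULL);
    if (err != 0) {
        /* SCHED_FIFO requires root or CAP_SYS_NICE */
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return 1;
    }
    pthread_join(rt_thread, NULL);
    return 0;
}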
Compile with:
gcc -O2 -o rt_control rt_control.c -lpthread -lrt
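The control loop above advances an absolute deadline and sleeps until it, but it does not notice when an iteration overruns its period. A minimal sketch of the deadline arithmetic with overrun detection follows. The helper names (timespec_add_ns, timespec_diff_ns, missed_periods) are hypothetical, not part of any library; the loop would call clock_gettime(CLOCK_MONOTONIC, &now) after waking and pass the result in.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Returns t advanced by ns nanoseconds (ns >= 0), with tv_nsec kept
 * normalized in the range [0, 1e9). */
struct timespec timespec_add_ns(struct timespec t, int64_t ns)
{
    assert(ns >= 0);
    t.tv_sec += (time_t)(ns / NSEC_PER_SEC);
    t.tv_nsec += (long)(ns % NSEC_PER_SEC);
    if (t.tv_nsec >= NSEC_PER_SEC) {
        t.tv_nsec -= NSEC_PER_SEC;
        t.tv_sec++;
    }
    return t;
}

/* Signed difference a - b in nanoseconds. */
int64_t timespec_diff_ns(struct timespec a, struct timespec b)
{
    return (int64_t)(a.tv_sec - b.tv_sec) * NSEC_PER_SEC
         + (int64_t)(a.tv_nsec - b.tv_nsec);
}

/* Number of whole periods by which `now` is past the deadline `next`.
 * 0 means the wake-up was less than one full period late (or early). */
int64_t missed_periods(struct timespec now, struct timespec next,
                       int64_t period_ns)
{
    assert(period_ns > 0);
    int64_t late = timespec_diff_ns(now, next);
    return late > 0 ? late / period_ns : 0;
}
```

Counting missed periods in the loop and logging them outside the RT path gives a runtime view of the same worst-case behavior that cyclictest measures offline.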
Measuring latency with cyclictest
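# Install rt-tests
sudo apt install rt-tests
# Run cyclictest at 1000Hz on isolated CPU 3 for 60 seconds
# --interval is in microseconds (1000 = 1ms interval)
# --affinity=3 pins the measurement thread to the isolated CPU
# --histogram=200 builds a latency histogram up to 200µs
# --smp is left out here: it starts one thread per CPU, and this run
# targets only the isolated core
sudo cyclictest \
--mlockall \
--priority=99 \
--interval=1000 \
--distance=0 \
--affinity=3 \
--duration=60 \
--histogram=200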
# Good result on Jetson Orin with PREEMPT_RT:
# T: 0 ( 3) P:99 I:1000 C: 60000 Min: 18 Act: 24 Avg: 21 Max: 87
# Worst-case: 87µs at 60000 samples
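For automated regression checks it can help to extract the Max field from the summary line instead of reading it by eye. The sketch below assumes the line layout shown in the sample output above; cyclictest output may differ between rt-tests versions, and the struct and function names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Fields of one cyclictest per-thread summary line (latencies in µs). */
struct cyclictest_summary {
    long cycles, min_us, act_us, avg_us, max_us;
};

/* Parses a line such as
 *   "T: 0 ( 3) P:99 I:1000 C: 60000 Min: 18 Act: 24 Avg: 21 Max: 87"
 * Returns 0 on success, -1 if the line does not match that layout. */
int parse_cyclictest_line(const char *line, struct cyclictest_summary *out)
{
    assert(line != NULL && out != NULL);
    int thread, tid, prio;
    long interval;
    int n = sscanf(line,
                   " T:%d (%d) P:%d I:%ld C:%ld Min:%ld Act:%ld Avg:%ld Max:%ld",
                   &thread, &tid, &prio, &interval,
                   &out->cycles, &out->min_us, &out->act_us,
                   &out->avg_us, &out->max_us);
    return n == 9 ? 0 : -1;
}

/* Returns 1 if the measured worst case fits within budget_us, else 0. */
int worst_case_within(const struct cyclictest_summary *s, long budget_us)
{
    assert(s != NULL);
    return s->max_us <= budget_us;
}
```

A test harness could run cyclictest under load, feed each summary line to parse_cyclictest_line, and fail the run when worst_case_within returns 0 for the chosen budget.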
Latency comparison: standard vs PREEMPT_RT
| Kernel | Idle worst-case | Under load worst-case | Suitable for |
|---|---|---|---|
| Standard L4T 5.15 | 2-5ms | 15-30ms | Applications without hard timing |
| PREEMPT_RT L4T 5.15 | 50-120µs | 150-500µs | Control loops ≥500Hz |
| Jailhouse RTOS core | 10-30µs | 10-30µs | Hard real-time <100µs |
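To relate the table to a specific control loop, subtract the worst-case wake-up latency and the loop's own worst-case execution time from the period. The helper below is a back-of-the-envelope sketch: cycle_slack_us is a hypothetical name, and the 300µs execution time in the usage note is an illustrative assumption, not a measurement.

```c
#include <assert.h>

/* Slack left in each cycle, in µs: period minus worst-case wake-up
 * latency minus worst-case execution time of the loop body.
 * A negative result means the loop can miss its deadline. */
long cycle_slack_us(long rate_hz, long worst_latency_us, long exec_us)
{
    assert(rate_hz > 0);
    long period_us = 1000000L / rate_hz;
    return period_us - worst_latency_us - exec_us;
}
```

For a 1kHz loop with the table's 500µs under-load worst case and an assumed 300µs of control work, cycle_slack_us(1000, 500, 300) leaves 200µs of slack. With the standard kernel's 30ms worst case the result is negative, which is why that row is not suitable for this workload.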
When PREEMPT_RT latency is not sufficient and you need the Jailhouse hypervisor on Jetson, see "Yocto BSP setup for Jetson Orin with meta-tegra" for the Yocto build path used to enable both features.
Frequently Asked Questions
Does Jetson Orin support PREEMPT_RT?
Yes. PREEMPT_RT patches are maintained for the Jetson Orin L4T kernel (5.15 for JetPack 6.x). NVIDIA has not released a pre-built RT kernel, so you must apply the PREEMPT_RT patchset to the L4T kernel source and build it yourself, or use the meta-tegra Yocto layer with LINUX_KERNEL_TYPE = "preempt-rt". With PREEMPT_RT, worst-case latency on Orin drops from multi-millisecond to 50-200µs depending on interrupt load.
What is the typical worst-case latency with PREEMPT_RT on Jetson Orin?
On a quiet system with an isolated CPU core, cyclictest reports worst-case latency of 50-120µs on Jetson Orin. Under load (USB, PCIe activity, CUDA workloads), worst-case can spike to 300-500µs if interrupts from those subsystems are not properly isolated or affinity-pinned away from the RT core. Real-world control loops at 1kHz (1ms period) run reliably with 200µs headroom.
How is PREEMPT_RT different from standard Linux for real-time tasks?
The standard Linux kernel has non-preemptible sections — interrupt handlers, spinlocks, and certain kernel paths — where a high-priority RT task cannot preempt. PREEMPT_RT converts nearly all of these to preemptible mutexes, allowing RT threads to preempt almost anywhere in kernel code. This reduces worst-case latency from 5-30ms on standard kernels to 50-300µs on PREEMPT_RT.
Is PREEMPT_RT on Jetson Orin suitable for hard real-time control?
PREEMPT_RT is soft real-time — it provides very low typical latency and much better worst-case latency than standard Linux, but it does not provide hard real-time guarantees. For hard real-time (guaranteed sub-100µs response regardless of system load), consider the Jailhouse hypervisor approach that runs a bare-metal RTOS on a dedicated CPU core alongside Linux.
Written by
Aarón Angulo, Co-Founder & CEO · ProventusNova