Abstract
A real-time GPU-driven point cloud visualization system built in Unity, capable of rendering and interactively exploring large-scale LiDAR sensor data from autonomous vehicle datasets.

Contents
Introduction: Autonomous vehicle perception systems produce millions of LiDAR returns per second, and visualizing that data in real time is a non-trivial engineering problem. This project is a self-contained visualization tool that addresses that challenge directly: it loads and plays back KITTI LiDAR sequences, renders perception outputs as labeled 3D bounding boxes, and offers four runtime color modes (distance, height, intensity, and heuristic semantic) to support different analysis tasks. The motivation was to build something that reflects the architectural patterns used in production autonomy visualization tools (GPU-side culling, indirect rendering, sensor-frame-aware data handling, and background I/O) rather than a demo that works only at toy scale.
Distance mode — Jet colormap, blue → red by radial distance from sensor
Points close to the vehicle appear blue/green, and far points shift toward red. Useful for understanding sensor range and spotting occlusions.
Height mode — Jet colormap, blue → red by elevation
Low points (ground level) are blue; tall points (buildings, signs) are red. Good for checking ground-plane flatness or clearance.
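The distance and height modes both reduce to the same two steps: normalize a scalar to [0, 1], then look it up in a colormap. A minimal CPU-side sketch of that idea follows. It uses a common piecewise-linear approximation of Jet rather than the project's actual shader code, and the range limits (`max_range`, `min_h`, `max_h`) are hypothetical defaults chosen only for illustration.

```python
import math
from typing import Tuple

Color = Tuple[float, float, float]  # (r, g, b), each in [0, 1]


def clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))


def jet(t: float) -> Color:
    """Piecewise-linear approximation of Jet: dark blue at 0, dark red at 1."""
    t = clamp01(t)
    r = clamp01(1.5 - abs(4.0 * t - 3.0))
    g = clamp01(1.5 - abs(4.0 * t - 2.0))
    b = clamp01(1.5 - abs(4.0 * t - 1.0))
    return (r, g, b)


def normalize(value: float, lo: float, hi: float) -> float:
    """Map value from [lo, hi] to [0, 1], clamping anything outside the range."""
    if hi <= lo:
        return 0.0
    return clamp01((value - lo) / (hi - lo))


def distance_color(x: float, y: float, z: float, max_range: float = 80.0) -> Color:
    # Assumption: "radial distance" is the 3D Euclidean distance to the sensor origin.
    return jet(normalize(math.sqrt(x * x + y * y + z * z), 0.0, max_range))


def height_color(height: float, min_h: float = -2.0, max_h: float = 4.0) -> Color:
    # Assumption: the height limits are fixed; a real tool might expose them as settings.
    return jet(normalize(height, min_h, max_h))
```

On the GPU the same arithmetic would run per vertex or per fragment; only the normalization input changes between the two modes.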
Intensity mode — Turbo colormap, dark → bright by surface reflectivity
Intensity measures how much laser light returned to the sensor. Retroreflective surfaces like lane markings and road signs glow bright; asphalt and foliage are dark. Turbo was chosen over Jet here because its lightness ramp is much smoother, so fine intensity detail doesn't get lost in the rainbow banding Jet produces.
Semantic mode — Fixed color bands by height threshold
A heuristic (no ML involved) that assigns color by elevation band:
Below 0.12 m → gray (ground/road)
0.12–1.0 m → green (pedestrians)
1.0–2.6 m → orange (vehicles)
Above 2.6 m → blue (buildings/structures)
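The band lookup above can be sketched as a small threshold table. The thresholds, colors, and labels come from the list; the treatment of exact boundary values (each band includes its lower bound and excludes its upper bound) is an assumption, as are the names `SEMANTIC_BANDS` and `semantic_band`.

```python
from typing import List, Tuple

# (exclusive upper bound in meters, color, label), in ascending order.
SEMANTIC_BANDS: List[Tuple[float, str, str]] = [
    (0.12, "gray", "ground/road"),
    (1.0, "green", "pedestrians"),
    (2.6, "orange", "vehicles"),
]
TOP_BAND: Tuple[str, str] = ("blue", "buildings/structures")


def semantic_band(height_m: float) -> Tuple[str, str]:
    """Return (color, label) for a point height using fixed elevation bands."""
    for upper, color, label in SEMANTIC_BANDS:
        if height_m < upper:
            return (color, label)
    return TOP_BAND
```

Because the classification depends only on height, a tall truck or a low wall will be mislabeled; that is the expected trade-off of a heuristic with no learned model behind it.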
Technical Details:
The renderer achieves its performance through a fully GPU-driven pipeline. A compute shader (FrustumCulling.compute) runs one thread per point, applies height, radial-distance, distance-based LOD, and view-frustum rejection tests in parallel, and appends the indices of surviving points to an AppendStructuredBuffer. A second, single-thread kernel writes the resulting count into an indirect argument buffer, which lets Graphics.DrawProceduralIndirect draw the entire visible point cloud (up to 200,000 points) in a single draw call without reading the count back to the CPU.
Geometry is synthesized entirely on the GPU: the vertex shader reads the visible indices and expands each point into a view-aligned quad using SV_VertexID, with no mesh assets involved. The same indirect-draw approach handles the 3D bounding box wireframes, keeping the full scene cost at two draw calls and two compute dispatches per frame.
Sensor data is loaded on a background thread and passed to the main thread through a lock-guarded handoff for GPU upload, which avoids frame stalls during file I/O. The system also handles the coordinate frame conversions between the Velodyne frame, KITTI camera label space, and Unity world space, a common source of subtle bugs in AV visualization tools.
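The culling pass can be illustrated with a sequential CPU reference. This is a sketch under stated assumptions, not the contents of FrustumCulling.compute: the threshold values, the "keep every Nth point" LOD rule, the plane convention, and the choice of six vertices per quad are all hypothetical. Only the 200,000-point capacity and the four-value indirect argument layout (vertex count per instance, instance count, start vertex, start instance) are taken as given.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]         # (x, y, z) in Unity world space, y up
Plane = Tuple[float, float, float, float]  # (nx, ny, nz, d); inside when n.p + d >= 0


@dataclass
class CullParams:
    # Illustrative placeholder values, except capacity.
    min_height: float = -3.0
    max_height: float = 10.0
    max_range: float = 80.0
    lod_start: float = 40.0   # beyond this range the cloud is thinned
    lod_stride: int = 2       # assumed LOD rule: keep every Nth point past lod_start
    capacity: int = 200_000   # size of the visible-index buffer


def passes_tests(index: int, p: Point, sensor: Point,
                 planes: Sequence[Plane], params: CullParams) -> bool:
    """The per-point work one GPU thread would do."""
    x, y, z = p
    if not (params.min_height <= y <= params.max_height):
        return False                                   # height rejection
    dist = math.hypot(x - sensor[0], z - sensor[2])
    if dist > params.max_range:
        return False                                   # radial-distance rejection
    if dist > params.lod_start and index % params.lod_stride != 0:
        return False                                   # distance-based LOD thinning
    # View-frustum rejection: the point must lie inside every plane.
    return all(nx * x + ny * y + nz * z + d >= 0.0 for nx, ny, nz, d in planes)


def cull(points: Sequence[Point], sensor: Point,
         planes: Sequence[Plane], params: CullParams) -> List[int]:
    """Sequential stand-in for the append buffer: collect surviving indices."""
    visible: List[int] = []
    for i, p in enumerate(points):
        if len(visible) >= params.capacity:
            break
        if passes_tests(i, p, sensor, planes, params):
            visible.append(i)
    return visible


def indirect_args(visible_count: int, verts_per_point: int = 6) -> List[int]:
    """Argument buffer contents: [vertex count, instance count, start vertex, start instance]."""
    return [visible_count * verts_per_point, 1, 0, 0]
```

Unlike this loop, a GPU append buffer does not preserve input order, so anything downstream should treat the visible-index list as an unordered set.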
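The background-loading handoff might look like the following sketch. It relies on the standard KITTI Velodyne scan layout (a flat sequence of little-endian float32 records: x, y, z, intensity); the names `FrameHandoff`, `load_async`, and the `read_bytes` callback are hypothetical, and the project itself would do this in Unity rather than Python.

```python
import struct
import threading
from typing import Callable, List, Optional, Tuple

PointXYZI = Tuple[float, float, float, float]  # x, y, z, intensity


def parse_kitti_bin(raw: bytes) -> List[PointXYZI]:
    """Decode a KITTI Velodyne scan: consecutive 16-byte float32 records."""
    usable = len(raw) - (len(raw) % 16)  # ignore a trailing partial record
    return list(struct.iter_unpack("<4f", raw[:usable]))


class FrameHandoff:
    """Single-slot mailbox: the loader thread publishes, the main thread takes."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: Optional[List[PointXYZI]] = None

    def publish(self, frame: List[PointXYZI]) -> None:
        # Called on the loader thread once parsing is finished.
        with self._lock:
            self._pending = frame

    def take(self) -> Optional[List[PointXYZI]]:
        # Called on the main thread once per rendered frame; never blocks on I/O.
        with self._lock:
            frame, self._pending = self._pending, None
        return frame


def load_async(read_bytes: Callable[[], bytes], handoff: FrameHandoff) -> threading.Thread:
    """Run the slow read and parse off the main thread, then hand the result over."""
    def worker() -> None:
        handoff.publish(parse_kitti_bin(read_bytes()))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The lock is held only long enough to swap a reference, so the main thread pays a near-constant cost per frame regardless of file size; the GPU upload itself still happens on the main thread after `take()` returns a frame.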
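The axis remapping behind the coordinate conversions can be written down compactly. The sketch below covers only the axis conventions: Velodyne is right-handed with x forward, y left, z up; the KITTI camera frame used by labels has x right, y down, z forward; Unity is left-handed with x right, y up, z forward. It deliberately omits the Velodyne-to-camera calibration transform that is needed to place both sources in one shared frame, and the function names are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def velodyne_to_unity(p: Vec3) -> Vec3:
    """Velodyne (x forward, y left, z up) -> Unity (x right, y up, z forward)."""
    x, y, z = p
    return (-y, z, x)


def kitti_camera_to_unity(p: Vec3) -> Vec3:
    """KITTI camera (x right, y down, z forward) -> Unity axes; only y flips."""
    x, y, z = p
    return (x, -y, z)
```

Both mappings flip handedness, which is why rotations (for example a bounding box yaw) usually need their sign or axis adjusted as well as their positions.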