These visuals highlight the simulation environment, testing process, and immersive design elements behind the roadside detection and tracking research project. They supplement the research poster and offer additional insight into the project’s development.
Visual reference of how vehicles merge upstream of a work zone under controlled simulation.
A drive-through of the custom freeway work zone, highlighting the simulation layout from a first-person perspective.
When a roadside camera detects a vehicle, it only knows where the vehicle is in the image, not in the real world.
This research used a pinhole camera model to convert 2D image positions into 3D world coordinates, factoring in the camera’s height, angle, and field of view.
The process involved calculating angular offsets and performing geometric transformations to get accurate location estimates — a perfect opportunity to apply my mathematics background.
These calculations helped validate whether detections matched actual vehicle trajectories and supported downstream tracking evaluation.
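To make the later steps easier to follow, here is a minimal Python sketch of the camera parameters the conversion depends on. The names, types, and units are illustrative, not taken from the project code; the snippets after each figure in the Calculations section below use the same symbols.

from dataclasses import dataclass

@dataclass
class CameraParams:
    height_m: float      # mounting height h above the road surface
    tilt_rad: float      # downward tilt angle θ
    pan_rad: float       # pan angle γ about the vertical axis
    fov_rad: float       # horizontal field of view α
    image_width_px: int  # image width L in pixels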
Calculations
What does this mean?
These formulas come from two similar triangles in the pinhole model. They tell us how a 3D point at depth x projects onto the image plane. Farther objects (large x) appear closer to the center of the image (small s and t), and closer objects appear farther from the center. Together, they form the familiar matrix equation that maps world (y, z) into image (s, t) via the focal length f.
Figure 1. Pinhole camera projection mapping 3D coordinates P(x, y, z) to 2D coordinates Q(s, t).
Imagine the camera is a tiny hole, and light travels in straight lines. This picture shows how a point out in space “casts” onto the flat sensor. Farther points end up closer to the center of the image; nearer points land farther from it.
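For readers who want the formulas spelled out: by the similar-triangle argument above, a point at depth x with offsets (y, z) lands at s = f·y/x and t = f·z/x. A minimal Python version of that forward projection (the sign and axis conventions here are my assumption, not taken from the project):

def project_point(x, y, z, f):
    """Pinhole projection: map a 3D point at depth x to image coordinates (s, t)."""
    s = f * y / x  # larger depth x -> smaller s, i.e., closer to the image center
    t = f * z / x  # larger depth x -> smaller t
    return s, t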
This shows how the camera’s pixel-based focal length f ties directly to its field of view α and sensor width L. By looking at half the image width and half the FOV, we can compute f in pixels. In practice, this fixes our “zoom” factor so that angles in the real world correspond exactly to pixel distances in the image.
Figure 2. Focal length f determined from half the sensor width L/2 and half the field-of-view (FOV) α/2.
Think of stretching your arms out: half your arm-span and half your vision angle set how “zoomed-in” you are. Here, that relationship tells us exactly how many pixels the camera focal length must be.
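In code this relationship is a one-liner. A sketch assuming α is the horizontal field of view and L is the image width in pixels:

import math

def focal_length_px(image_width_px, fov_rad):
    """Pixel focal length f from half the image width (L/2) and half the FOV (α/2)."""
    return (image_width_px / 2) / math.tan(fov_rad / 2)

For example, a 1920-pixel-wide image with a 90° FOV gives f = 960 / tan(45°) = 960 pixels.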
Here, we convert a pixel’s horizontal offset Δx from the image center into an angle (φ) in the camera’s frame, using the same half-FOV / half-width ratio. Adding the camera’s downward tilt (θ) gives the true horizontal ray angle xφ in world coordinates. This angle tells us the direction along the ground plane where that pixel’s ray intersects.
Figure 3. Horizontal viewing angle φ is derived from pixel offset Δx and adjusted by camera tilt θ.
If you spot something off-center on your screen, you mentally turn your head by a certain angle to look at it. This diagram shows how we calculate that angle from a pixel position, and then add the camera’s downward tilt for the true direction.
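One common way to write this step in Python, assuming f comes from the previous formula; the exact ratio used in the project may be the simpler linear scaling of half-FOV per half-width rather than the arctangent:

import math

def pixel_to_ray_angle(delta_x_px, f_px, tilt_rad):
    """Convert a pixel offset Δx from the image center into an angle φ, then add the downward tilt θ."""
    phi = math.atan(delta_x_px / f_px)  # angle of the pixel's ray in the camera frame
    return tilt_rad + phi               # xφ: ray angle measured in world coordinates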
Once we know the horizontal ray angle xφ and the camera height h, we can compute how far along the ground (distance xω) the ray hits. In effect, it converts the “look-down” angle and camera height into a forward distance from the camera to the object on the flat road.
Next we account for the vertical pixel offset yφ. We compute the slanted ray length to the ground, h / sin(xφ), multiply it by tan(yφ) to get the sideways offset of the ray, and then correct for the ray’s tilt from the horizontal with a factor of cos(xφ). The result yω is how far left or right of the centerline the object lands on the ground.
Figure 4. Ground-plane intersection maps the tilted ray to forward distance xω and lateral offset yω.
Picture shining a flashlight from a balcony onto the sidewalk: you can work out how far forward the light lands and how far to the left or right. This shows the same idea in math, for both forward range and side offset.
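Putting those two paragraphs into code as I read them (a sketch under the assumption that xφ is measured down from the horizontal, so the forward distance is h / tan(xφ)):

import math

def ray_to_ground(h, phi_x, phi_y):
    """Intersect the pixel's ray with the flat road surface."""
    x_omega = h / math.tan(phi_x)                        # forward distance xω along the ground
    slant = h / math.sin(phi_x)                          # slanted ray length from camera to ground
    y_omega = slant * math.tan(phi_y) * math.cos(phi_x)  # lateral offset yω; simplifies to x_omega * tan(phi_y)
    return x_omega, y_omega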
Finally, if the camera has been panned left or right by angle γ, we rotate our intermediate ground-plane coordinates (x′ω, y′ω) back into the global frame. This standard 2×2 rotation matrix aligns the measured positions with the real-world axes, giving you true global coordinates for every detected object.
Figure 5. Pan rotation by γ to align intermediate coordinates (x'ω, y'ω) with global axes.
If the camera is turned a bit to the side (pan), we simply rotate our coordinates by that angle so the final positions match the real-world map direction.
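The rotation itself, sketched in the same style (I assume γ is measured counter-clockwise; the signs flip if the pan convention is the opposite):

import math

def apply_pan(x_omega_prime, y_omega_prime, gamma):
    """Rotate intermediate ground-plane coordinates (x'ω, y'ω) by the pan angle γ into the global frame."""
    x_omega = math.cos(gamma) * x_omega_prime - math.sin(gamma) * y_omega_prime
    y_omega = math.sin(gamma) * x_omega_prime + math.cos(gamma) * y_omega_prime
    return x_omega, y_omega

Chained together (focal length → ray angles → ground intersection → pan rotation), these sketches trace the same image-to-world conversion described above.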