voxcity.simulator_gpu.solar.reflection
======================================

.. py:module:: voxcity.simulator_gpu.solar.reflection

.. autoapi-nested-parse::

   Optimized Radiation Computation for GPU

   This module provides optimized GPU kernels for radiation computation
   that minimize kernel launches and synchronization overhead.

   Key optimizations:

   1. Fused kernels - combine multiple operations into single kernel launches
   2. Reduced synchronization - batch operations to minimize ti.sync() calls
   3. Better memory access patterns - coalesced memory access
   4. Reduced atomic operations - use local accumulation where possible


Attributes
----------

.. autoapisummary::

   voxcity.simulator_gpu.solar.reflection.Vector3
   voxcity.simulator_gpu.solar.reflection.gpu_times


Classes
-------

.. toctree::
   :hidden:

   /autoapi/voxcity/simulator_gpu/solar/reflection/OptimizedReflectionSolver

.. autoapisummary::

   voxcity.simulator_gpu.solar.reflection.OptimizedReflectionSolver


Functions
---------

.. autoapisummary::

   voxcity.simulator_gpu.solar.reflection.fused_reflection_step_kernel
   voxcity.simulator_gpu.solar.reflection.compute_initial_and_reflections_fused
   voxcity.simulator_gpu.solar.reflection.benchmark_reflections


Module Contents
---------------

.. py:data:: Vector3

.. py:function:: fused_reflection_step_kernel(surfins_in: ti.template(), surfins_out: ti.template(), surfout: ti.template(), albedo: ti.template(), svf: ti.template(), total_incoming: ti.template(), total_outgoing: ti.template(), svf_source: ti.template(), svf_target: ti.template(), svf_vf: ti.template(), svf_trans: ti.template(), svf_nnz: taichi.i32, n_surfaces: taichi.i32)

   Single fused kernel for one reflection step.

   Combines: outgoing computation + distribution + accumulation
   into fewer synchronization points.


.. py:function:: compute_initial_and_reflections_fused(surf_direction: ti.template(), surf_svf: ti.template(), surf_shadow: ti.template(), surf_canopy_trans: ti.template(), surf_albedo: ti.template(), surf_normal: ti.template(), sun_dir_x: taichi.f32, sun_dir_y: taichi.f32, sun_dir_z: taichi.f32, cos_zenith: taichi.f32, sw_direct: taichi.f32, sw_diffuse: taichi.f32, svf_source: ti.template(), svf_target: ti.template(), svf_vf: ti.template(), svf_trans: ti.template(), svf_nnz: taichi.i32, n_surfaces: taichi.i32, n_ref_steps: taichi.i32, sw_in_direct: ti.template(), sw_in_diffuse: ti.template(), sw_in_reflected: ti.template(), sw_out_total: ti.template(), surfins_a: ti.template(), surfins_b: ti.template(), surfout: ti.template())

   Fully fused kernel: initial radiation + all reflection iterations.

   This is the most optimized version that runs everything in one kernel.


.. py:function:: benchmark_reflections()

   Benchmark the reflection solver.


.. py:data:: gpu_times
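
The docstrings above describe a reflection step as "outgoing computation + distribution + accumulation". The following is a minimal pure-Python sketch of that arithmetic, not the module's Taichi implementation. It assumes that ``svf_source``, ``svf_target``, ``svf_vf`` and ``svf_trans`` hold a coordinate-list (COO) sparse view-factor matrix with ``svf_nnz`` entries, that outgoing flux is ``albedo * incoming``, and that flux travels from the source index to the target index. None of these assumptions is confirmed by the reference above, and the function names ``reflection_step`` and ``run_reflections`` are hypothetical.

```python
def reflection_step(incoming, albedo, src, dst, vf, trans):
    """One reflection bounce over a COO-style sparse view-factor list (illustrative)."""
    n = len(incoming)
    # Outgoing computation: each surface reflects a fraction of what it received.
    outgoing = [albedo[i] * incoming[i] for i in range(n)]
    # Distribution: every (src, dst) entry forwards a share of the source's
    # outgoing flux, scaled by view factor and transmissivity (assumed meaning).
    received = [0.0] * n
    for k in range(len(vf)):
        received[dst[k]] += outgoing[src[k]] * vf[k] * trans[k]
    return outgoing, received


def run_reflections(initial, albedo, src, dst, vf, trans, n_steps):
    """Repeat bounces; the flux received in one bounce is the input to the next."""
    n = len(initial)
    current = list(initial)    # flux arriving in the current bounce
    total_in = list(initial)   # accumulated incoming flux per surface
    total_out = [0.0] * n      # accumulated reflected flux per surface
    for _ in range(n_steps):
        outgoing, received = reflection_step(current, albedo, src, dst, vf, trans)
        # Accumulation: add this bounce's contribution to the running totals.
        for i in range(n):
            total_out[i] += outgoing[i]
            total_in[i] += received[i]
        current = received
    return total_in, total_out


# Two surfaces that see only each other, both with albedo 0.5.
total_in, total_out = run_reflections(
    initial=[100.0, 0.0], albedo=[0.5, 0.5],
    src=[0, 1], dst=[1, 0], vf=[1.0, 1.0], trans=[1.0, 1.0],
    n_steps=2,
)
print(total_in, total_out)  # [125.0, 50.0] [50.0, 25.0]
```

In this sketch the three stages run as separate sequential loops. The point of the fused kernels documented above is to combine such stages into fewer GPU launches, where the scatter into ``received`` would need atomic adds or local accumulation when executed in parallel.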