Senior GPU Networking Architect
292 500 – 507 000 PLN/year · Employment contract (gross)
375 000 – 650 000 PLN/year · Employment contract (gross)
Senior · Full-time · Employment contract
#320960 · Added 28 days ago · 49
Source: NVIDIA
Tech Stack / Keywords
Networking, AI, Architecture, Network, CUDA, NVSHMEM, LLM, PyTorch
Company and position
NVIDIA is a technology company with over 25 years of experience in computer graphics, PC gaming, and accelerated computing. The company is focused on AI and GPU computing, developing software foundations for large-scale AI systems.
Requirements
- 5+ years of hands-on CUDA programming, including writing and optimizing GPU kernels.
- M.Sc. or equivalent experience in computer science, computer engineering, or related field.
- Strong understanding of GPU architecture fundamentals including warp scheduling, shared memory, L2 cache, memory coalescing, occupancy tuning, and asynchronous execution.
- Experience with systems-level C/C++ development in performance-critical environments.
- Familiarity with GPU data movement mechanisms such as GPUDirect RDMA and GPU-initiated communication.
- Ability to read and analyze GPU performance profiles (e.g., Nsight Compute, Nsight Systems) and apply optimizations.
- Strong collaboration skills in a multi-national, interdisciplinary environment.
Nice to have:
- Experience developing or optimizing communication kernels in libraries such as NCCL, NVSHMEM, or similar.
- Understanding of distributed deep learning parallelism techniques and their communication patterns.
- Background in RDMA, InfiniBand, high-speed networking, and GPU system topology including NVLink, NVSwitch, PCIe, and network fabrics.
- Experience with overlap techniques like kernel pipelining, persistent kernels, or cooperative groups.
- Proven experience optimizing large-scale LLM training or inference workloads and familiarity with frameworks such as PyTorch, TensorRT-LLM, or vLLM.
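As a hedged illustration of the GPU architecture fundamentals listed above (memory coalescing in particular), a minimal CUDA kernel might look like the following sketch; the kernel and all names in it are illustrative, not part of the role:

```cuda
#include <cuda_runtime.h>

// Illustrative kernel: consecutive threads read consecutive addresses,
// so each warp's global loads and stores coalesce into the minimum
// number of memory transactions -- the baseline access pattern the
// "memory coalescing" requirement refers to.
__global__ void scale(const float* __restrict__ in,
                      float* __restrict__ out,
                      float alpha, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // globally unique index
    if (i < n) {
        out[i] = alpha * in[i];  // coalesced load and store per warp
    }
}
```

Launched as, say, `scale<<<(n + 255) / 256, 256>>>(d_in, d_out, 2.0f, n);`, occupancy tuning would then adjust the block size based on the register and shared-memory usage reported by Nsight Compute.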
Responsibilities
- Build, implement, and optimize GPU communication kernels for collective and point-to-point operations in large-scale AI systems.
- Leverage knowledge of GPU architecture to improve kernel efficiency, minimize latency, and overlap computation with communication.
- Develop GPU-resident communication primitives and device-side APIs for kernel-initiated data movement across nodes and accelerators.
- Profile and tune GPU kernels end-to-end, identifying bottlenecks and driving targeted optimizations.
- Collaborate with network software, hardware, and AI framework teams to co-design communication strategies aligned with GPU execution patterns.
- Build proofs-of-concept, conduct experiments, and perform quantitative modeling to evaluate new communication strategies.
- Contribute to the evolution of programming models exposing GPU-aware networking capabilities to application developers.
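As a hedged sketch of the computation/communication overlap mentioned above, the classic double-buffered pattern issues each chunk's transfer and kernel into one of two CUDA streams, so the copy for one chunk overlaps the kernel processing the other (all names hypothetical):

```cuda
#include <cuda_runtime.h>

__global__ void process(float* buf, int n);  // hypothetical compute kernel

// Double-buffered pipeline: chunk k uses stream k % 2 and buffer k % 2,
// so its copy overlaps the kernel running on the other buffer in the
// other stream. Within one stream the copy and the kernel are ordered,
// so reusing a buffer two iterations later is safe.
void pipeline(const float* h_in, float* d_buf[2],
              int num_chunks, int chunk_elems) {
    cudaStream_t s[2];
    cudaStreamCreate(&s[0]);
    cudaStreamCreate(&s[1]);
    size_t bytes = chunk_elems * sizeof(float);
    for (int k = 0; k < num_chunks; ++k) {
        cudaStream_t cur = s[k % 2];
        cudaMemcpyAsync(d_buf[k % 2], h_in + (size_t)k * chunk_elems,
                        bytes, cudaMemcpyHostToDevice, cur);
        process<<<(chunk_elems + 255) / 256, 256, 0, cur>>>(
            d_buf[k % 2], chunk_elems);
    }
    cudaStreamSynchronize(s[0]);
    cudaStreamSynchronize(s[1]);
    cudaStreamDestroy(s[0]);
    cudaStreamDestroy(s[1]);
}
```

For the copies to actually overlap with compute, `h_in` must be pinned host memory (`cudaMallocHost`); at larger scale the same idea extends to device-initiated, kernel-side transfers via NVSHMEM.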
Offer
- Highly competitive salaries.
- Comprehensive benefits package.