Senior GPU Networking Architect

292 500 – 507 000 PLN / year · Employment contract (gross)
375 000 – 650 000 PLN / year · Employment contract (gross)
Senior · Full-time · Employment contract
#320960 · Added 28 days ago
Source: NVIDIA

Tech Stack / Keywords

Networking · AI · Architecture · Network · CUDA · NVSHMEM · LLM · PyTorch

Company and Position

NVIDIA is a technology company with over 25 years of experience in computer graphics, PC gaming, and accelerated computing. The company is focused on AI and GPU computing, developing software foundations for large-scale AI systems.


Requirements

  • 5+ years of hands-on CUDA programming, including writing and optimizing GPU kernels.
  • M.Sc. or equivalent experience in computer science, computer engineering, or related field.
  • Strong understanding of GPU architecture fundamentals including warp scheduling, shared memory, L2 cache, memory coalescing, occupancy tuning, and asynchronous execution.
  • Experience with systems-level C/C++ development in performance-critical environments.
  • Familiarity with GPU data movement mechanisms such as GPUDirect RDMA and GPU-initiated communication.
  • Ability to read and analyze GPU performance profiles (e.g., Nsight Compute, Nsight Systems) and apply optimizations.
  • Strong collaboration skills in a multi-national, interdisciplinary environment.
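To illustrate one of the architecture fundamentals named above (memory coalescing), here is a minimal, hypothetical CUDA sketch — not taken from the posting:

```cuda
// Coalesced global-memory access: consecutive threads in a warp read
// consecutive floats, so the hardware services the warp with a small
// number of wide memory transactions.
__global__ void scale_coalesced(const float* __restrict__ in,
                                float* __restrict__ out,
                                float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // thread i -> element i
    if (i < n) {
        out[i] = in[i] * factor;
    }
}

// By contrast, a strided pattern such as in[i * stride] breaks coalescing:
// each thread in the warp touches a different cache line, multiplying
// memory traffic for the same amount of useful data.
```

Reading the resulting memory-throughput difference out of an Nsight Compute profile is exactly the kind of analysis the last two bullets describe.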

Nice to have:

  • Experience developing or optimizing communication kernels in libraries such as NCCL, NVSHMEM, or similar.
  • Understanding of distributed deep learning parallelism techniques and their communication patterns.
  • Background in RDMA, InfiniBand, high-speed networking, and GPU system topology including NVLink, NVSwitch, PCIe, and network fabrics.
  • Experience with overlap techniques like kernel pipelining, persistent kernels, or cooperative groups.
  • Proven experience optimizing large-scale LLM training or inference workloads and familiarity with frameworks such as PyTorch, TensorRT-LLM, or vLLM.
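The "overlap techniques" bullet refers to patterns like the following hedged sketch, which hides a host-to-device copy behind computation using two CUDA streams; `process`, the buffer names, and the chunk sizes are placeholders:

```cuda
// Hypothetical pipelining sketch: the async copy for chunk k runs on
// copy_stream while earlier chunks compute on compute_stream.
// Assumes h_buf is pinned host memory (required for true async copies)
// and d_buf[2] are device-side double buffers.
cudaStream_t copy_stream, compute_stream;
cudaStreamCreate(&copy_stream);
cudaStreamCreate(&compute_stream);

for (int k = 0; k < num_chunks; ++k) {
    cudaMemcpyAsync(d_buf[k % 2], h_buf + k * chunk_bytes, chunk_bytes,
                    cudaMemcpyHostToDevice, copy_stream);

    // Make the compute stream wait only on this chunk's copy,
    // not on the whole copy stream.
    cudaEvent_t ready;
    cudaEventCreate(&ready);
    cudaEventRecord(ready, copy_stream);
    cudaStreamWaitEvent(compute_stream, ready, 0);

    process<<<grid, block, 0, compute_stream>>>(d_buf[k % 2], chunk_elems);
    cudaEventDestroy(ready);
}
cudaDeviceSynchronize();
```

Persistent kernels push the same idea further by keeping a single kernel resident and feeding it work, avoiding per-chunk launch latency.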

Responsibilities

  • Build, implement, and optimize GPU communication kernels for collective and point-to-point operations in large-scale AI systems.
  • Leverage knowledge of GPU architecture to improve kernel efficiency, minimize latency, and overlap computation with communication.
  • Develop GPU-resident communication primitives and device-side APIs for kernel-initiated data movement across nodes and accelerators.
  • Profile and tune GPU kernels end-to-end, identifying bottlenecks and driving targeted optimizations.
  • Collaborate with network software, hardware, and AI framework teams to co-design communication strategies aligned with GPU execution patterns.
  • Build proofs-of-concept, conduct experiments, and perform quantitative modeling to evaluate new communication strategies.
  • Contribute to the evolution of programming models exposing GPU-aware networking capabilities to application developers.
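The "GPU-resident communication primitives" responsibility is the domain of NVSHMEM's device-side API. A minimal, hypothetical sketch (kernel-launch and initialization details omitted; `sym_buf` is assumed to be a symmetric allocation from `nvshmem_malloc`):

```cuda
#include <nvshmem.h>
#include <nvshmemx.h>

// Device-initiated data movement: each thread writes one element directly
// into the symmetric buffer of a remote PE, with no host round-trip.
__global__ void push_to_neighbor(float* sym_buf, const float* local,
                                 int n, int next_pe) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // Single-element put into next_pe's copy of sym_buf.
        nvshmem_float_p(sym_buf + i, local[i], next_pe);
    }
    // Order/complete this thread's outstanding puts before the kernel exits.
    nvshmem_quiet();
}
```

Because the communication is issued from inside the kernel, it can be fused and overlapped with computation — the pattern the responsibilities above describe.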

Offer

  • Highly competitive salaries.
  • Comprehensive benefits package.