Training

NVIDIA Nsight Compute training resources.

External Resources

Forum

Blogs

Videos

Code Examples

  • Have a look at our coding examples on GitHub

Usage Examples

Filter Options

Unless stated otherwise, the examples use the term workload to refer to any of kernels, graphs, ranges, or cmdlists.
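Each example in this section lists only the filter options, not a complete command. As a minimal sketch, assuming the profiler is invoked as `ncu` and the target is a hypothetical binary `./my_app`, the options go between the `ncu` command and the application. The snippet only prints the resulting command line instead of running it.

```shell
# Hypothetical target application; replace with your own binary.
app=./my_app

# Filter options copied from an example in this section
# (here: the first two workloads launched on the device with device ID 1).
filters="--device 1 --launch-count 2"

# Assemble the command line: ncu, then the filter options, then the application.
cmd="ncu $filters $app"

# Dry run: print the command rather than executing it.
printf '%s\n' "$cmd"
```

To profile for real, run the printed command directly in a shell where `ncu` is on the `PATH`.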

  1. Profile the first two workloads

    --launch-count 2

  2. Profile the first two workloads launched on the device with device ID 1

    --device 1 --launch-count 2

  3. Profile the 2nd workload on each GPU

    --launch-skip 1 --launch-count 1 --filter-mode per-gpu

  4. Skip the first 2 workloads of each launch configuration before profiling

    --launch-skip 2 --filter-mode per-launch-config

  5. Profile “Bar” kernel

    --kernel-name Bar

  6. Profile kernels that have “Bar” in their function name

    --kernel-name regex:Bar

  7. Profile only the 2nd invocation of kernel “Foo”

    --kernel-id ::Foo:2

  8. Profile only the 3rd invocation of kernel “Foo” launched across all CUDA devices

    --kernel-id ::Foo:3

  9. Profile only the 2nd invocation of kernel “Foo” launched on the CUDA device with device ID 1

    --kernel-id ::Foo:2 --device 1

  10. Profile only the 2nd invocation of kernel “Foo” launched on each CUDA device

    --kernel-id ::Foo:2 --filter-mode per-gpu

  11. Profile only the 2nd invocation of kernel “Foo” launched with each unique launch configuration, i.e., grid size, block size, and shared memory bytes

    --kernel-id ::Foo:2 --filter-mode per-launch-config

  12. Profile only the 2nd invocation of all kernels that have “Bar” in their name

    --kernel-id ::regex:Bar:2

  13. Skip the first 2 workloads before matching “Foo” or “Bar” in kernel names

    --launch-skip-before-match 2 --kernel-name regex:"Foo|Bar"

  14. Profile every 7th kernel invocation with mangled name “_FooBar” on CUDA context ID 1 and stream ID 2

    --kernel-id 1:2:_FooBar:7 --kernel-name-base mangled

  15. Profile only the 7th invocation of kernel “Foo”, regardless of context ID and stream ID

    --kernel-id ::Foo:7

  16. Profile every 7th invocation of kernel “Foo” launched on stream ID 1 with NVTX stream name “cuda_stream”, regardless of context ID

    --kernel-id :1|cuda_stream:Foo:7 --nvtx

  17. Profile all workloads launched in the first 3 ranges created by cu(da)ProfilerStart/Stop APIs

    --range-filter :[1-3]:

  18. Profile all workloads launched in the 2nd NVTX Push/Pop range A

    --range-filter ::2 --nvtx --nvtx-include A/

  19. Profile all workloads launched in NVTX Push/Pop range A except the ones in NVTX Push/Pop range B

    --nvtx --nvtx-include A/ --nvtx-exclude B/

  20. Profile all “Foo” kernels except those launched in NVTX Push/Pop range B

    --nvtx --nvtx-exclude B/ --kernel-name Foo

  21. Profile all workloads launched in the 2nd NVTX Start/End range A inside the 2nd range created by cu(da)ProfilerStart/Stop APIs

    --range-filter yes:2:2 --nvtx --nvtx-include A

  22. Profile all workloads launched in the 1st NVTX Push/Pop range A inside both the 1st and 2nd ranges created by cu(da)ProfilerStart/Stop APIs

    --range-filter yes:[1-2]:1 --nvtx --nvtx-include A/

  23. Profile all workloads launched in the 1st range created by cu(da)ProfilerStart/Stop APIs with the 2nd NVTX Push/Pop range A and domain D

    --range-filter no:1:2 --nvtx --nvtx-include D@A/
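Judging from the examples above, a `--kernel-id` value is a colon-separated sequence of context ID, stream ID, kernel name (optionally prefixed with `regex:`), and invocation number, where an empty context or stream field matches any value. The sketch below assembles such identifiers with a hypothetical helper, `kernel_id`, which is not part of Nsight Compute, and prints the resulting option strings. It also shows that a name pattern containing `|` should be wrapped in plain ASCII quotes so the shell does not interpret it as a pipe.

```shell
# Hypothetical helper (not part of Nsight Compute): joins the four fields of a
# --kernel-id value as context:stream:name:invocation.
kernel_id() {
    printf '%s:%s:%s:%s' "$1" "$2" "$3" "$4"
}

# Example 7: invocation number 2 of "Foo", any context and stream.
id_any=$(kernel_id "" "" Foo 2)

# Example 14: context ID 1, stream ID 2, mangled name "_FooBar", invocation number 7.
id_ctx=$(kernel_id 1 2 _FooBar 7)

# Example 12: the name field carries the regex: operator.
id_regex=$(kernel_id "" "" regex:Bar 2)

# Example 13: single quotes keep "|" literal instead of starting a pipeline.
name_filter='regex:Foo|Bar'

# Print the option strings; append them to an ncu command line to use them.
printf '%s\n' "--kernel-id $id_any"
printf '%s\n' "--kernel-id $id_ctx --kernel-name-base mangled"
printf '%s\n' "--kernel-id $id_regex"
printf '%s\n' "--launch-skip-before-match 2 --kernel-name $name_filter"
```

Building the identifier from named fields makes it harder to miscount colons, which is easy to do when the context and stream fields are left empty.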

Notices

ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, “MATERIALS”) ARE BEING PROVIDED “AS IS.” NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.

Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the consequences of use of such information or for any infringement of patents or other rights of third parties that may result from its use. No license is granted by implication or otherwise under any patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all other information previously supplied. NVIDIA Corporation products are not authorized as critical components in life support devices or systems without express written approval of NVIDIA Corporation.

Trademarks

NVIDIA and the NVIDIA logo are trademarks or registered trademarks of NVIDIA Corporation in the U.S. and other countries. Other company and product names may be trademarks of the respective companies with which they are associated.