User: Password:
|
|
Subscribe / Log in / New account

Call-graph / call-chain support

Call-graph / call-chain support

Posted Jul 5, 2009 10:28 UTC (Sun) by mingo (subscriber, #31122)
In reply to: Perfcounters added to the mainline by mingo
Parent article: Perfcounters added to the mainline

btw., another thing worth mentioning about perfcounters is turn-key call-graph support and call-graph visualization:

 $ perf record -g -f ./pipe-test-1m
 [ perf record: Captured and wrote 8.169 MB perf.data (~356901 samples) ]

 $ perf report --sort symbol --callchain fractal,5 | cat

 #
 # (80245 samples)
 #
 # Overhead  Symbol
 # ........  ......
 #
     4.50%  [k] pipe_read
                |          
                 --99.00%-- do_sync_read
                           vfs_read
                           sys_read
                           system_call_fastpath
                           __GI___libc_read
                           __libc_start_main

     4.39%  [.] main

     4.27%  [k] __switch_to
                |          
                |          |          
                |          |--50.97%-- __GI___libc_write
                |          |          
                |           --49.06%-- __GI___libc_read
                |                     __libc_start_main
                |          
                 --11.19%-- thread_return
                           |          
                           |--51.44%-- __GI___libc_write
                           |          
                            --48.83%-- __GI___libc_read
                                      __libc_start_main

     3.75%  [k] copy_user_generic_string
                |          
                |--52.11%-- do_sync_read
                |          vfs_read
                |          sys_read
                |          system_call_fastpath
                |          __GI___libc_read
                |          __libc_start_main
                |          
                 --45.39%-- pipe_write
                           do_sync_write
                           vfs_write
                           sys_write
                           system_call_fastpath
                           __GI___libc_write
                           __libc_start_main

     3.36%  [k] avc_has_perm_noaudit
                |          
                 --96.59%-- avc_has_perm
                           inode_has_perm
                           file_has_perm
                           selinux_file_permission
                           security_file_permission
                           rw_verify_area
                           |          
                           |--51.34%-- vfs_read
                           |          sys_read
                           |          system_call_fastpath
                           |          __GI___libc_read
                           |          __libc_start_main
                           |          
                            --48.66%-- vfs_write
                                      sys_write
                                      system_call_fastpath
                                      |          
                                       --99.53%-- __GI___libc_write
                                                 __libc_start_main

     3.29%  [k] schedule
                |          
                |--50.34%-- sysret_careful
                |          __GI___libc_write
                |          
                 --46.55%-- pipe_wait
                           pipe_read
                           do_sync_read
                           vfs_read
                           sys_read
                           system_call_fastpath
                           __GI___libc_read
                           __libc_start_main

     2.89%  [k] switch_mm
                |          
                 --97.67%-- schedule
                           |          
                           |--50.75%-- pipe_wait
                           |          pipe_read
                           |          do_sync_read
                           |          vfs_read
                           |          sys_read
                           |          system_call_fastpath
                           |          __GI___libc_read
                           |          __libc_start_main
                           |          
                            --49.29%-- sysret_careful
                                      __GI___libc_write

     2.85%  [.] __GI___libc_write

     2.70%  [.] __GI___libc_read

     2.60%  [k] file_has_perm
                |          
                |--93.67%-- selinux_file_permission
                |          security_file_permission
                |          rw_verify_area
                |          |          
                |          |--55.02%-- vfs_write
                |          |          sys_write
                |          |          system_call_fastpath
                |          |          |          
                |          |           --99.81%-- __GI___libc_write
                |          |                     __libc_start_main
                |          |          
                |           --44.98%-- vfs_read
                |                     sys_read
                |                     system_call_fastpath
                |                     __GI___libc_read
                |                     __libc_start_main
                |          
                 --6.33%-- security_file_permission
                           rw_verify_area
                           |          
                           |--52.27%-- vfs_write
                           |          sys_write
                           |          system_call_fastpath
                           |          |          
                           |          |--97.10%-- __GI___libc_write
                           |          |          __libc_start_main
                           |          |          
                           |           --4.35%-- __write_nocancel
                           |          
                            --47.73%-- vfs_read
                                      sys_read
                                      system_call_fastpath
                                      __GI___libc_read
                                      __libc_start_main

     2.15%  [k] pipe_write
                |          
                 --98.31%-- do_sync_write
                           vfs_write
                           sys_write
                           system_call_fastpath
                           __GI___libc_write
                           __libc_start_main

     2.05%  [k] system_call
                |          
                |--50.64%-- __GI___libc_write
                |          |          
                |           --49.46%-- __libc_start_main
                |          
                 --49.24%-- __GI___libc_read
                           __libc_start_main

Here we record and output full call-chains (down to and including user-space call-chains) and display the overhead in a tree - detailing the call-path that results in that profile entry - and recursively so. (the '5' is a 5% filter - to skip entries below a 5% (relative-)overhead threshold)

For example this portion:

     3.75%  [k] copy_user_generic_string
                |          
                |--52.11%-- do_sync_read
                |          vfs_read
                |          sys_read
                |          system_call_fastpath
                |          __GI___libc_read
                |          __libc_start_main
                |          
                 --45.39%-- pipe_write
                           do_sync_write
                           vfs_write
                           sys_write
                           system_call_fastpath
                           __GI___libc_write
                           __libc_start_main
Tells us that in this workload there's a combined overhead of 3.75% from user-copies (copy_user_generic_string()), and that ~52% overhead of that comes from a user-space read() and 45% comes from a user-space write() call.

With traditional 'flat' profiling output we'd only know that there's 3.75% overhead in copy_user_generic_string() - we would not know where it comes from.


(Log in to post comments)

Call-graph / call-chain support

Posted Jul 5, 2009 11:52 UTC (Sun) by njs (guest, #40338) [Link]

That's lovely. What would be even more lovely would be the ability to output KCacheGrind's format (documentation), or at least something easily parseable (with less ascii art) and thus mungeable into said format, if that's possible?


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds