Name: systemtap-toolkit
Owner: ??
Description: YouZan systemtap toolkit to online analyze on production
Created: 2016-11-09 03:21:25.0
Updated: 2018-05-11 00:11:31.0
Pushed: 2017-04-14 13:27:37.0
Size: 80
Language: Perl
systemtap-toolkit
This is the @YouZan systemtap toolkit for online analysis of complicated problems on production systems under heavy load. All tools are based on systemtap, the amazing Linux tracing/probing tool.
Anyone who wants to know what is really going on in user space and kernel space should learn systemtap, which is an awesome tool :)
We need systemtap and DWARF debuginfo. Some scripts work in kernel space and others work in user space.
For kernel space, we need the kernel debuginfo, like kernel-debuginfo-3.10.0-327.28.3.el7.x86_64.
For user space, we need the user application's debuginfo, like redis-debuginfo-2.8.19-2.el7.x86_64.
On Red Hat family Linux distributions, install the requirements as follows:
sudo yum install yum-utils  # for debuginfo-install
sudo yum install systemtap
sudo yum install kernelname-devel-version
sudo debuginfo-install kernelname-version
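The kernel debuginfo package has to match the running kernel exactly. As a minimal sketch, assuming the RHEL/CentOS naming seen in this README (kernel-debuginfo- followed by the kernel release string), the required package name can be derived from the running kernel. The helper name debuginfo_package is hypothetical and not part of the toolkit:

```python
import platform


def debuginfo_package(release: str) -> str:
    """Return the kernel debuginfo package name for a kernel release string.

    Assumption: the RHEL/CentOS convention 'kernel-debuginfo-' + release,
    as in the example package name given in this README.
    """
    return "kernel-debuginfo-" + release


if __name__ == "__main__":
    # platform.release() reports the same string as `uname -r`.
    print(debuginfo_package(platform.release()))
```

The printed name is what the debuginfo for the running kernel would be called under that convention; other distributions name their debug packages differently.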
You can help this project in any of the following ways.
Special thanks to @brendangregg, @agentzh and @fche. Everything we have learned about systemtap comes from their amazing blog posts and projects :)
tcp-passive-syn-ack-time measures the time from the SYN packet to the ACK packet on the server side of the TCP three-way handshake (thanks, tcpguide).
[root@localhost tmp]# ./tcp-passive-syn-ack-time -p 80 -t 5000
Collecting tcp dport (80)...syn-ack time
interval min:197us, max:858us avg:519us, cnt:3
value |-------------------------------------------------- count
   32 |                                                   0
   64 |                                                   0
  128 |@                                                  1
  256 |@                                                  1
  512 |@                                                  1
 1024 |                                                   0
 2048 |                                                   0
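The histogram above uses power-of-two buckets, in the style of systemtap's @hist_log. The following is a minimal Python sketch of that bucketing, assuming each sample lands in the largest power of two that does not exceed it. The three sample values are hypothetical, chosen only to be consistent with the min/max/avg/cnt summary shown:

```python
from collections import Counter


def log2_bucket(value_us: int) -> int:
    """Largest power of two <= value_us (0 for non-positive values)."""
    if value_us <= 0:
        return 0
    return 1 << (value_us.bit_length() - 1)


def summarize(samples_us):
    """Return (min, max, integer average, count, {bucket: count})."""
    hist = Counter(log2_bucket(s) for s in samples_us)
    return (min(samples_us), max(samples_us),
            sum(samples_us) // len(samples_us), len(samples_us), dict(hist))


if __name__ == "__main__":
    # Hypothetical SYN-to-ACK times in microseconds.
    lo, hi, avg, cnt, hist = summarize([197, 502, 858])
    print(f"min:{lo}us, max:{hi}us avg:{avg}us, cnt:{cnt}")
    for bucket in sorted(hist):
        print(f"{bucket:>5} |{'@' * hist[bucket]} {hist[bucket]}")
```

Log-scale buckets keep the histogram short even when latencies span several orders of magnitude, which is why a single slow handshake does not stretch the table.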
tcp-active-syn-ack-time measures the time from the SYN packet to the ACK packet on the client side of the TCP three-way handshake (thanks, tcpguide).
[root@localhost systemtap-toolkit]# ./tcp-active-syn-ack-time -p 80 -t 5000
Collecting tcp dport (80)...syn-ack time
dport:80 min:417us, max:542us avg:460us, cnt:3
value |-------------------------------------------------- count
   64 |                                                   0
  128 |                                                   0
  256 |@@                                                 2
  512 |@                                                  1
 1024 |                                                   0
 2048 |                                                   0
tcp-retrans reports which TCP packets are being retransmitted.
[root@localhost systemtap-toolkit]# ./tcp-retrans
Printing tcp retransmission
10.0.2.15:49896 -> 172.17.9.41:80 state:TCP_SYN_SENT rto:0 -> 1000 ms
10.0.2.15:49896 -> 172.17.9.41:80 state:TCP_SYN_SENT rto:1000 -> 2000 ms
10.0.2.15:49896 -> 172.17.9.41:80 state:TCP_SYN_SENT rto:2000 -> 4000 ms
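The rto values in this output double on each retransmission (1000, 2000, 4000 ms), which is TCP's exponential backoff. A simplified sketch of that progression follows. It assumes a 1000 ms starting timeout as in the output and a 120000 ms cap; the cap is an assumption based on the Linux default (TCP_RTO_MAX) and is not something this output shows. backoff_sequence is a hypothetical helper:

```python
def backoff_sequence(initial_ms: int, retries: int, max_ms: int = 120_000):
    """Yield the retransmission timeout used for each successive retry.

    The timeout doubles after every retry and is clamped to max_ms.
    """
    rto = initial_ms
    for _ in range(retries):
        yield rto
        rto = min(rto * 2, max_ms)


if __name__ == "__main__":
    print(list(backoff_sequence(1000, 3)))
```

A connection stuck in TCP_SYN_SENT with a growing rto, as in the sample, usually means the SYN is never answered.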
who-open-file finds which processes are opening the specified file.
[root@localhost systemtap-toolkit]# ./who-open-file -f 123 -t 10000
Collecting who is opening filename 123
13740) is opening the filename: "123"
13741) is opening the filename: "123"
who-ctxswitch-process traces context switches for the specified process.
[root@localhost systemtap-toolkit]# ./who-ctxswitch-process -p 6354
Collecting who is context switch 6354
swapper/0 ( 0)<R> => nginx ( 6354)<R>
nginx ( 6354)<S> => nginx ( 6355)<R>
nginx ( 6355)<D> => nginx ( 6354)<R>
nginx ( 6354)<S> => rcu_sched ( 10)<R>
nginx ( 6355)<D> => nginx ( 6354)<R>
This tool traces syscall.connect calls.
et(8062) is connecting to AF_INET@192.168.33.10:1800
et(8063) is connecting to AF_INET@192.168.33.10:1800
et(8064) is connecting to AF_INET@192.168.33.10:1800
et(8065) is connecting to AF_INET@192.168.33.10:1800
et(8066) is connecting to AF_INET@192.168.33.10:1800
et(8067) is connecting to AF_INET@192.168.33.10:1800
et(8068) is connecting to AF_INET@192.168.33.10:1800
et(8069) is connecting to AF_INET@192.168.33.10:1800
et(8070) is connecting to AF_INET@192.168.33.10:1800
sample-bt comes from agentzh and samples backtraces in user space and kernel space.
sample-bt -p 8736 -t 5 -u > a.bt
WARNING: Tracing 8736 (/opt/nginx/sbin/nginx) in user-space only...
WARNING: Missing unwind data for module, rerun with 'stap -d stap_df60590ce8827444bfebaf5ea938b5a_11577'
WARNING: Time's up. Quitting now...(it may take a while)
WARNING: Number of errors: 0, skipped probes: 24
watch-var monitors changes to a function parameter.
[root@localhost systemtap-toolkit]# ./watch-var -f syscall.open -v filename -p 25849
WARNING: Tracing vars syscall.open filename in 25849...
t[25849] kernel.function("SyS_open@fs/open.c:1036").call filename: "" => ""./test""
Like tcpdump, tcp-trace-packet traces TCP packets, but in more detail, including the TCP flags.
[root@localhost systemtap-toolkit]# ./tcp-trace-packet
WARNING: tracking 0 tcp packet
067249998698 10.0.2.15:22 => 10.0.2.2:50627 len:92 SYN:0 ACK:1 FIN:0 RST:0 PSH:1 URG:0
067249998955 10.0.2.2:50627 <= 10.0.2.15:22 len:40 SYN:0 ACK:1 FIN:0 RST:0 PSH:0 URG:0
067250199252 10.0.2.15:22 => 10.0.2.2:50627 len:172 SYN:0 ACK:1 FIN:0 RST:0 PSH:1 URG:0
067250199559 10.0.2.2:50627 <= 10.0.2.15:22 len:40 SYN:0 ACK:1 FIN:0 RST:0 PSH:0 URG:0
067250399756 10.0.2.15:22 => 10.0.2.2:50627 len:100 SYN:0 ACK:1 FIN:0 RST:0 PSH:1 URG:0
067250399963 10.0.2.2:50627 <= 10.0.2.15:22 len:40 SYN:0 ACK:1 FIN:0 RST:0 PSH:0 URG:0
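Each output line prints the six classic TCP flags as separate 0/1 fields. To illustrate where those fields come from, here is a sketch that decodes them from the flags byte of a TCP header using the standard bit positions. decode_tcp_flags and format_flags are hypothetical helpers, not part of the toolkit:

```python
# Standard bit positions of the six classic flags in the TCP header.
TCP_FLAG_BITS = {
    "FIN": 0x01,
    "SYN": 0x02,
    "RST": 0x04,
    "PSH": 0x08,
    "ACK": 0x10,
    "URG": 0x20,
}


def decode_tcp_flags(flags_byte: int) -> dict:
    """Map each flag name to 1 if its bit is set in flags_byte, else 0."""
    return {name: int(bool(flags_byte & bit)) for name, bit in TCP_FLAG_BITS.items()}


def format_flags(flags_byte: int) -> str:
    """Render the flags in the same order as the sample output."""
    flags = decode_tcp_flags(flags_byte)
    order = ["SYN", "ACK", "FIN", "RST", "PSH", "URG"]
    return " ".join(f"{name}:{flags[name]}" for name in order)


if __name__ == "__main__":
    print(format_flags(0x18))  # a data segment carrying ACK and PSH
```

In the sample, the lines with PSH:1 are data segments and the len:40 lines with only ACK:1 are their acknowledgements.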
ngx-req-watch traces nginx in user space: it watches nginx requests in real time and can filter them by a specified condition.
[root@localhost systemtap-toolkit]# ./ngx-req-watch -p 5614
WARNING: watching /opt/tengine/sbin/nginx(8521 8522 8523 8524) requests
nginx(8523) GET URI:/123?a=123 HOST:127.0.0.1 STATUS:200 FROM 127.0.0.1 FD:16 RT: 0ms
nginx(8523) GET URI:/123?a=123 HOST:127.0.0.1 STATUS:200 FROM 127.0.0.1 FD:16 RT: 0ms
nginx(8523) GET URI:/123?a=123&b=123 HOST:127.0.0.1 STATUS:200 FROM 127.0.0.1 FD:16 RT: 0ms
nginx(8523) GET URI:/123?w HOST:127.0.0.1 STATUS:200 FROM 127.0.0.1 FD:16 RT: 0ms
nginx(8523) GET URI:/123?w HOST:test STATUS:200 FROM 127.0.0.1 FD:16 RT: 0ms
nginx(8523) GET URI:/123?w=a HOST:test STATUS:200 FROM 127.0.0.1 FD:16 RT: 0ms
stracelike works like strace, but it is based on systemtap.
[root@localhost systemtap-toolkit]# ./stracelike -p 4580 -t 20000
WARNING: stracing syscall
Oct 29 12:46:19 2016.094410 epoll_wait(16, 0x1e17b40, 512, 100) = 0 <0.100334>
Oct 29 12:46:19 2016.194756 epoll_wait(16, 0x1e17b40, 512, 100) = 0 <0.100227>
Oct 29 12:46:19 2016.295006 epoll_wait(16, 0x1e17b40, 512, 100) = 0 <0.101086>
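Because the output follows strace's layout (call name, arguments, return value, elapsed seconds in angle brackets), it is easy to post-process. A small sketch that pulls those fields out of one line is shown below. The regular expression is an assumption based only on the sample lines above and handles plain integer return values only; parse_call is a hypothetical helper:

```python
import re

# name(args) = ret <seconds>, anywhere in the line
CALL_RE = re.compile(
    r"(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)\s+<(?P<secs>\d+\.\d+)>"
)


def parse_call(line: str):
    """Return a dict with name, args, ret and secs, or None if no call is found."""
    match = CALL_RE.search(line)
    if match is None:
        return None
    return {
        "name": match.group("name"),
        "args": [arg.strip() for arg in match.group("args").split(",")],
        "ret": int(match.group("ret")),
        "secs": float(match.group("secs")),
    }


if __name__ == "__main__":
    sample = "epoll_wait(16, 0x1e17b40, 512, 100) = 0 <0.100334>"
    print(parse_call(sample))
```

Summing the secs field per syscall name gives a quick view of where a process spends its time; in the sample, each epoll_wait call returns 0 after roughly its 100 ms timeout.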
redis-watch-req traces redis in user space: it watches redis requests in real time and can filter them by a specified condition.
[root@localhost systemtap-toolkit]# ./redis-watch-req -p 23261
WARNING: watching /usr/bin/redis-server(23261) requests
redis-server(23261) RT:30(us) REQ: id:2 fd:5 ==> get a #-1 RES: #9
redis-server(23261) RT:23(us) REQ: id:2 fd:5 ==> set a #12 RES: #5
redis-server(23261) RT:16(us) REQ: id:2 fd:5 ==> get foo #-1 RES: #5
libcurl-watch-req traces user space: it watches requests made by software based on libcurl, like curl and php, and can filter them by a specified condition.
[root@localhost systemtap-toolkit]# ./libcurl-watch-req
WARNING: Tracing libcurl (0) ...
(23759) URL:http://www.google.com RT:448(ms) RTCODE:0
(23767) URL:http://www.facebook.com/asdfasdf RT:596(ms) RTCODE:0
(23769) URL:https://www.facebook.com/asdfasdf RT:902(ms) RTCODE:0
pdomysql-watch-query traces user space: it watches queries sent through PHP's PDO MySQL driver and can filter them by a specified condition.
[root@localhost systemtap-toolkit]# ./pdomysql-watch-query -l /usr/lib64/php/modules/pdo_mysql.so
Tracing pdo-mysql (0)
php-fpm(12896) 172.17.10.196:3306@root: SELECT * from person RT:0(ms) RTCODE:1
php-fpm(12896) 172.17.10.196:3306@root: SELECT * from person RT:8(ms) RTCODE:1
php-fpm(12896) 172.17.10.196:3306@root: SELECT sleep(5) RT:5012(ms) RTCODE:1
phpredis-watch-req traces user space and shows the redis requests made through the PHP redis extension.
[root@localhost systemtap-toolkit]# ./phpredis-watch-req -l /usr/lib64/php/modules/redis.so
Tracing phpredis (/usr/lib64/php/modules/redis.so)
17226)<zim_Redis___construct[22us]>
17226)<zim_Redis_connect[113us]>
17226)<zim_Redis_get[157us]>:*2 $3 GET $3 key
17226)<zim_Redis_hGet[563us]>:*3 $4 HGET $3 key $6 ffffff
17226)<zim_Redis_set[617us]>:*3 $3 SET $3 key $4 abcd
17226)<zim_Redis___destruct[12us]>
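The text after each command (for example *2 $3 GET $3 key) is the request in the Redis serialization protocol (RESP), shown with spaces where the wire format has CRLF line breaks. A sketch of how such a request is encoded follows. encode_resp and flatten are hypothetical helpers, and the space-joined rendering is an assumption about how the tool prints the request:

```python
def encode_resp(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode()
        # Each bulk string is "$<byte length>\r\n<bytes>\r\n".
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def flatten(wire: bytes) -> str:
    """Render a RESP request on one line, with CRLF replaced by spaces."""
    return " ".join(wire.decode().split("\r\n")).strip()


if __name__ == "__main__":
    print(flatten(encode_resp("HGET", "key", "ffffff")))
```

The leading *N is the number of arguments and each $N is the byte length of the argument that follows, which makes the traced lines easy to read back as commands.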
io-process-top traces I/O reads and writes per process (pid).
[root@localhost systemtap-toolkit]# ./io-process-top -t 1000
WARNING: Collecting IO Process Top 10 with interval of 1000ms
Process Name          Read(KB)  Write(KB)
redis-server(4510)           3          0
stapio(28280)                2          0
systemd-journal(442)         0          0
systemd(1)                   0          0
sshd(19948)                  0          0
in:imjournal(595)            0          0
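A per-process "top" view like this is an aggregation of individual read and write events by process, sorted and cut off at ten rows. The sketch below shows that aggregation in Python. The event tuple layout and the ordering by total bytes are assumptions made for illustration; the tool's actual internals are not described here, and top_io is a hypothetical helper:

```python
from collections import defaultdict


def top_io(events, limit=10):
    """Aggregate I/O events per process and return the busiest processes.

    events: iterable of (name, pid, direction, nbytes), where direction is
    "r" for a read and "w" for a write (hypothetical layout).
    Returns rows of (label, read_kb, write_kb), largest total first.
    """
    totals = defaultdict(lambda: [0, 0])  # (name, pid) -> [read bytes, write bytes]
    for name, pid, direction, nbytes in events:
        totals[(name, pid)][0 if direction == "r" else 1] += nbytes
    ranked = sorted(totals.items(), key=lambda item: sum(item[1]), reverse=True)
    return [(f"{name}({pid})", rd // 1024, wr // 1024)
            for (name, pid), (rd, wr) in ranked[:limit]]


if __name__ == "__main__":
    sample = [("redis-server", 4510, "r", 3072), ("stapio", 28280, "r", 2048)]
    for label, read_kb, write_kb in top_io(sample):
        print(f"{label:<22}{read_kb:>8}{write_kb:>11}")
```

Because of the integer division, a process that moved less than 1 KB in the interval shows as 0, which would explain rows of zeros like those in the sample.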
net-process-top traces network sends and receives per process (pid).
[root@localhost systemtap-toolkit]# ./net-process-top -t 5000
WARNING: Collecting Net Process Top 10 with interval of 5000ms
Process(    0)  dev   Send(PK) Recv(PK) Send(KB) Recv(KB)
nginx( 7266)    lo      446203        0   144471        0
wrk(27496)      lo      156601        0    15599        0
rcu_sched( 10)  eth0         0        1        0        0
sshd( 6890)     eth0         1        0        0        0
nssdns-watch-question traces libnss_dns.so to show DNS queries.
[root@localhost systemtap-toolkit]# ./nssdns-watch-question -l /usr/lib64/libnss_dns.so. -t 100000
WARNING: Tracing libnss_dns(/usr/lib64/libnss_dns.so.2) for pid:0
(11786): www.google.com 57994us
(11788): www.facebook.com 57406us
(11790): www.github.com 4203477us
phpfpm-watch-req traces php-fpm requests.
phpfpm-watch-req -l /opt/php/sbin/php-fpm
WARNING: Tracing php-fpm for pid(0)
php-fpm(9665) GET /index.php?&123123=123&f=q (208us)
php-fpm(9665) GET /index.php?&123123=123&f=q (172us)
php-fpm(9665) GET /index.php?&123123=123&f=q (154us)
php-fpm(9665) GET /index.php?&123123=123&f=q (151us)
swoole-redis-watch traces the swoole-redis write and read subroutines.
./swoole-redis-watch -l /opt/php/lib/php/extensions/no-debug-non-zts-20131226/swoole.so -t 100000
WARNING: Tracing swoole.so(/opt/php/lib/php/extensions/no-debug-non-zts-20131226/swoole.so) for pid:0
25927) is writing for 10.200.175.90:6379 to size(41) *3
chment
10.200.175.90:6379 get reply: integer:710263830
29486) is writing for 10.200.175.90:6379 to size(41) *3
chment
10.200.175.90:6379 get reply: integer:709720993