Name: yarn-logs-helpers
Owner: Hammer Lab
Description: Scripts for parsing / making sense of yarn logs
Created: 2014-11-20 04:49:13.0
Updated: 2017-11-27 12:34:42.0
Pushed: 2016-08-22 17:29:55.0
Homepage: null
Size: 37
Language: Shell
Scripts for parsing / making sense of yarn logs.
yarn-container-logs
The main script of note here is yarn-container-logs:
$ yarn-container-logs 0018
It can take a full application ID (e.g. application_1416279928169_0018) or just the last 4 digits of one (0018).
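The short-ID lookup can be pictured with a small sketch. It assumes the cluster ID is the numeric middle part of the full application ID; the function name `expand_app_id` and its two-argument form are hypothetical and are not taken from this repository's scripts.

```shell
# Hypothetical helper (not part of this repo): expand a short ID into a full one.
# Assumption: full IDs have the form application_<cluster id>_<4 digits>.
expand_app_id() (
  arg="$1"
  cluster_id="$2"
  case "$arg" in
    # Already a full application ID: pass it through unchanged.
    application_*) printf '%s\n' "$arg" ;;
    # Otherwise treat the argument as the trailing digits.
    *) printf 'application_%s_%s\n' "$cluster_id" "$arg" ;;
  esac
)
```

For example, `expand_app_id 0018 1416279928169` produces the full ID used throughout this README.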
It downloads the YARN logs for that application into a local directory (named after the application ID by default; this can be overridden with an optional second argument after the app ID) and splits them into per-container files:
# Directory created by yarn-container-logs
$ cd application_1416279928169_0018
# Directory with per-container logs
$ cd containers
# Per-container log files have prefix /container_/
$ ls container_*
container_1416279928169_0018_01_000015
container_1416279928169_0018_01_000016
container_1416279928169_0018_01_000017
# These files contain exactly what was pulled down from YARN.
$ head container_1416279928169_0018_01_000015
Container: container_1416279928169_0018_01_000015 on my-node-11-10.rest.of.domain.name_port
===============================================================================================
LogType: stderr
LogLength: 700
Log Contents:
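The splitting step can be sketched as follows. This is a minimal illustration, not the script's actual implementation: it assumes each container's section of the aggregated log begins with a `Container: container_... on ...` header line like the one shown above, and the function name `split_container_logs` is hypothetical.

```shell
# Hypothetical sketch: split aggregated YARN logs (on stdin) into one file per container.
# Assumption: every container section starts with "Container: <container id> on <host>".
split_container_logs() (
  out_dir="$1"
  mkdir -p "$out_dir" || exit 1
  awk -v dir="$out_dir" '
    /^Container: container_/ {
      if (out != "") close(out)   # avoid holding many files open at once
      out = dir "/" $2            # second field is the container ID
    }
    # Append, so a container whose header appears more than once is not truncated.
    # Lines are written unmodified, header included.
    out != "" { print >> out }
  '
)
```

Something like `yarn logs -applicationId <appid> | split_container_logs containers` would then populate a `containers` directory. Because the sketch appends, it should be pointed at a fresh directory on each run.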
It also creates a directory per node (a.k.a. “host”) containing symlinks to the log-files of all containers that ran on that node:
$ cd hosts
$ ls -l my-node-*
my-node-08-1:
lrwxrwxrwx 1 <user> <group> 41 Nov 20 04:42 container_1416279928169_0018_01_000065 -> ../container_1416279928169_0018_01_000065
lrwxrwxrwx 1 <user> <group> 41 Nov 20 04:42 container_1416279928169_0018_01_000094 -> ../container_1416279928169_0018_01_000094
lrwxrwxrwx 1 <user> <group> 41 Nov 20 04:42 container_1416279928169_0018_01_000123 -> ../container_1416279928169_0018_01_000123
lrwxrwxrwx 1 <user> <group> 41 Nov 20 04:42 container_1416279928169_0018_01_000258 -> ../container_1416279928169_0018_01_000258
lrwxrwxrwx 1 <user> <group> 41 Nov 20 04:42 container_1416279928169_0018_01_000338 -> ../container_1416279928169_0018_01_000338
my-node-08-10:
lrwxrwxrwx 1 <user> <group> 41 Nov 20 04:42 container_1416279928169_0018_01_000041 -> ../container_1416279928169_0018_01_000041
lrwxrwxrwx 1 <user> <group> 41 Nov 20 04:42 container_1416279928169_0018_01_000158 -> ../container_1416279928169_0018_01_000158
lrwxrwxrwx 1 <user> <group> 41 Nov 20 04:42 container_1416279928169_0018_01_000275 -> ../container_1416279928169_0018_01_000275
lrwxrwxrwx 1 <user> <group> 41 Nov 20 04:42 container_1416279928169_0018_01_000354 -> ../container_1416279928169_0018_01_000354
lrwxrwxrwx 1 <user> <group> 41 Nov 20 04:42 container_1416279928169_0018_01_000424 -> ../container_1416279928169_0018_01_000424
This functionality lives in rename-and-link-hosts.
Note that the host names above have had .rest.of.domain.name_<port> removed for brevity; this is enabled by setting the $YARN_HELPERS_DROP_HOST_SUFFIX_FROM environment variable (see the Installing section for more details).
A common use case is parsing logs from Spark apps running on YARN, for which yarn-container-logs has some specific functionality:
It can identify the logs corresponding to Spark driver containers. It greps all container logs for spark.SparkContext to identify drivers (you can override this by setting the $YARN_HELPERS_DRIVER_GREP_NEEDLE environment variable), and creates symlinks to them in the drivers directory:
$ ls -l drivers
lrwxrwxrwx 1 <user> <group> 41 Nov 20 04:42 0 -> ../container_1416279928169_0018_01_000015
lrwxrwxrwx 1 <user> <group> 41 Nov 20 04:42 container_1416279928169_0018_01_000015 -> ../container_1416279928169_0018_01_000015
If exactly one was found, an additional top-level driver symlink will point to it:
$ ls -l driver
lrwxrwxrwx 1 <user> <group> 9 Nov 20 04:42 driver -> drivers/0
This functionality lives in link-driver-logs.
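The driver detection described above can be sketched roughly like this. It is an illustration under assumptions, not a copy of link-driver-logs: the function name `link_driver_logs` is hypothetical, it must be run from the directory holding the `container_*` files, and treating the needle as a fixed string (`grep -F`) is a choice made here rather than something the README states.

```shell
# Hypothetical sketch of driver detection; run from the per-container log directory.
link_driver_logs() (
  needle="${YARN_HELPERS_DRIVER_GREP_NEEDLE:-spark.SparkContext}"
  mkdir -p drivers || exit 1
  i=0
  # grep -l prints the names of files containing the needle.
  # Word-splitting is safe here because container file names contain no spaces.
  for f in $(grep -lF -- "$needle" container_* 2>/dev/null); do
    ln -sf "../$f" "drivers/$i"   # numbered link: drivers/0, drivers/1, ...
    ln -sf "../$f" "drivers/$f"   # link under the container's own name
    i=$((i + 1))
  done
  # Create the top-level shortcut only when exactly one driver was found.
  if [ "$i" -eq 1 ]; then
    ln -sf drivers/0 driver
  fi
)
```

With a single matching container this yields the same layout as the listings above: `drivers/0`, `drivers/<container>`, and `driver -> drivers/0`.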
It will create a tids directory and populate it with one symlink per Spark task ID (TID) that it finds evidence of in the logs, each pointing to the container log file where that task seemingly ran.
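A rough sketch of the TID linking follows. The README does not say which log lines count as evidence of a task, so this assumes the executor-side Spark message `Running task <n> in stage <m> (TID <id>)`; both that pattern and the function name `link_task_ids` are assumptions made for illustration.

```shell
# Hypothetical sketch; run from the per-container log directory.
# Assumption: executors log "Running task X in stage Y (TID N)" for each task they run.
link_task_ids() (
  mkdir -p tids || exit 1
  for f in container_*; do
    [ -f "$f" ] || continue
    # Print only the captured TID number from each matching line.
    sed -nE 's/.*Running task [0-9.]+ in stage [0-9.]+ \(TID ([0-9]+)\).*/\1/p' "$f" |
      sort -un |
      while read -r tid; do
        ln -sf "../$f" "tids/$tid"
      done
  done
)
```

Matching only the executor-side message keeps driver logs, which mention every TID when scheduling tasks, from claiming the links.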
yarn-logs-stack-traces uses a stack-trace-parsing library on the output of yarn-logs. Example usage:
$ yarn-logs-stack-traces 0018 -d # -d means "show a histogram in descending order"
<N> stacks in total
<N> occurrences:
org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 4
at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:386)
at org.apache.spark.MapOutputTracker$$anonfun$org$apache$spark$MapOutputTracker$$convertMapStatuses$1.apply(MapOutputTracker.scala:383)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
...
at java.lang.Thread.run(Thread.java:744)
<N> occurrences:
java.io.IOException: Failed to connect to demeter-csmaz11-16.demeter.hpc.mssm.edu/172.29.46.86:33263
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:141)
at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:78)
at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
...
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
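To make the histogram idea concrete, here is a small stand-in written with awk. It does not use the stack-traces library that the real script relies on, and its output format (count, tab, first line of the trace) is a simplification; the function name `stack_histogram` is hypothetical.

```shell
# Hypothetical stand-in for a stack-trace histogram (not the stack-traces library).
# A trace is taken to be a header line followed by one or more "at ..." frame lines.
stack_histogram() {
  awk '
    function record(   key) {
      # Count the trace only if the header was followed by at least one frame.
      if (head != "" && frames != "") {
        key = head frames
        count[key]++
        first[key] = head
      }
    }
    /^[ \t]*at / { if (head != "") frames = frames "\n" $0; next }
    { record(); head = $0; frames = "" }
    END {
      record()
      for (key in count) printf "%d\t%s\n", count[key], first[key]
    }
  ' | sort -rn   # most frequent traces first
}
```

Piping log text into `stack_histogram` groups traces whose header and frames are identical and lists them in descending order of frequency.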
This repo contains several other scripts that wrap YARN commands in calls to yarn-appid, allowing last-4-lookup of application IDs:
yarn-kill: wrapper for yarn application -kill <appid>.
yarn-logs: wrapper for yarn logs -applicationId <appid>.
yarn-logs-less: pipes yarn-logs to less.
Installing
Download this repository with:
git clone --recursive https://github.com/hammerlab/yarn-logs-helpers.git
In your .bashrc (or equivalent), source .yarn-logs-helpers.sourceme:
$ source /path/to/repo/.yarn-logs-helpers.sourceme
This will:
Run the yarn-refresh-cluster-id script.
Store your YARN cluster ID in $yarn_cluster_id_file (default: $HOME/.yarn-cluster-id), for use by yarn-appid.
Add the scripts in this repo to your $PATH.
Setting $YARN_LOGS_USER may allow yarn-container-logs to fetch logs from apps run by users other than you.
You can set it permanently in your .bashrc to a user that has permission to read all YARN users' logs, or just on the command line for one call:
$ YARN_LOGS_USER=someone yarn-logs 1234
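One plausible way such a variable could be honored is by passing it to the `-appOwner` option of `yarn logs`. That mapping is an assumption, not something this README confirms, and the function name `yarn_logs_command` is hypothetical. The sketch only prints the command it would run, so it can be tried without a cluster.

```shell
# Hypothetical sketch: build (and print) a "yarn logs" command line.
# Assumption: YARN_LOGS_USER is forwarded as -appOwner; the real scripts may differ.
yarn_logs_command() (
  app_id="$1"
  if [ -n "${YARN_LOGS_USER:-}" ]; then
    printf 'yarn logs -applicationId %s -appOwner %s\n' "$app_id" "$YARN_LOGS_USER"
  else
    printf 'yarn logs -applicationId %s\n' "$app_id"
  fi
)
```

With `YARN_LOGS_USER=someone` set, the printed command gains `-appOwner someone`; with it unset, the command is the plain `yarn logs -applicationId <appid>` form listed earlier.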
You may also want to export YARN_HELPERS_DROP_HOST_SUFFIX_FROM (discussed above):
# Pattern for abbreviating host names when creating per-host log directories.
export YARN_HELPERS_DROP_HOST_SUFFIX_FROM=".rest.of.domain.name_"
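The effect of this setting on a host name can be sketched with shell parameter expansion. This assumes the value is matched as a literal string and that everything from its first occurrence onward is dropped; the README does not specify whether the real scripts treat it as a literal or a pattern, and the function name `abbreviate_host` is hypothetical.

```shell
# Hypothetical sketch: abbreviate a host name using YARN_HELPERS_DROP_HOST_SUFFIX_FROM.
# Assumption: the value is a literal string, not a regular expression.
abbreviate_host() (
  host="$1"
  pattern="${YARN_HELPERS_DROP_HOST_SUFFIX_FROM:-}"
  if [ -n "$pattern" ]; then
    # %% removes the longest suffix matching <pattern>*, i.e. it cuts the
    # name at the first occurrence of the pattern. Quoting keeps it literal.
    host="${host%%"$pattern"*}"
  fi
  printf '%s\n' "$host"
)
```

With the export above in place, `abbreviate_host my-node-11-10.rest.of.domain.name_port` prints `my-node-11-10`; with the variable unset, the name is returned unchanged.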
stack-traces submodule
Finally, ryan-williams/stack-traces is included in this repository as a git submodule, and used by yarn-logs-stack-traces.
You'll need to git clone --recursive when you check out the project, or run git submodule init && git submodule update from within the stack-traces subdirectory, for it to work. git-scm.com has a good intro to using git submodules if you are not familiar with them.
With those done you should be all set!