Sends gRPC requests to a profiler server to perform on-demand profiling.
tf.profiler.experimental.client.trace(
service_addr, logdir, duration_ms, worker_list='', num_tracing_attempts=3
)
This method blocks the calling thread until it receives the tracing result. It supports CPU, GPU, and Cloud TPU: it can profile a single host (CPU, GPU, or TPU) as well as multiple TPU workers. The profiled results are saved to the TensorBoard log directory you specify (e.g. the directory where you save your model checkpoints). Use the TensorBoard profile plugin to view the visualization and analysis results.
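Because the call blocks until the trace result arrives, a caller that needs to keep working can run it on a separate thread. The following is a minimal sketch of that idea, not part of the TensorFlow API: `trace_in_background` and `fake_trace` are hypothetical names, and `fake_trace` is only a stand-in so the snippet runs without a profiler server.

```python
import threading
import time


def trace_in_background(trace_fn, *args, **kwargs):
    """Run a blocking trace call on a worker thread.

    Returns the thread and a dict whose "error" entry holds any exception
    raised by trace_fn (None on success).
    """
    outcome = {"error": None}

    def _run():
        try:
            trace_fn(*args, **kwargs)
        except Exception as exc:  # keep the failure for the caller to inspect
            outcome["error"] = exc

    thread = threading.Thread(target=_run, daemon=True)
    thread.start()
    return thread, outcome


def fake_trace(service_addr, logdir, duration_ms):
    """Stand-in (assumption) for tf.profiler.experimental.client.trace."""
    time.sleep(duration_ms / 1000.0)


# With TensorFlow installed, pass tf.profiler.experimental.client.trace
# instead of fake_trace.
thread, outcome = trace_in_background(
    fake_trace, "grpc://localhost:6009", "/tmp/tb_log", 50)
# The caller is free to do other work here while the trace runs.
thread.join()
print(outcome["error"])
```

Joining the thread before reading `outcome` ensures the trace has finished, successfully or not.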
Args | |
---|---|
`service_addr` | gRPC address of the profiler service, e.g. `grpc://localhost:6009`. |
`logdir` | Path of the TensorBoard log directory, e.g. `/tmp/tb_log`. |
`duration_ms` | Duration of tracing or monitoring, in milliseconds. |
`worker_list` | Optional. The list of workers to profile in the current session (TPU only). |
`num_tracing_attempts` | Optional. Automatically retry N times when no trace event is collected (default 3). |
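As an illustration of the argument shapes, the sketch below builds `service_addr` and `worker_list` values from a list of host IPs. `build_trace_targets` is a hypothetical helper, not a TensorFlow function. It assumes the convention of the multi-TPU example on this page, where the first host goes into `service_addr` and the remaining hosts are passed as a comma-separated string. This page does not state whether the first host should also appear in `worker_list`.

```python
def build_trace_targets(hosts, port=8466):
    """Return (service_addr, worker_list) for a list of host IPs.

    Hypothetical helper: the first host becomes the gRPC service address and
    the remaining hosts are joined into a comma-separated string.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    service_addr = f"grpc://{hosts[0]}:{port}"
    worker_list = ",".join(hosts[1:])
    return service_addr, worker_list


service_addr, worker_list = build_trace_targets(
    ["10.0.0.2", "10.0.0.3", "10.0.0.4"])
print(service_addr)
print(worker_list)
```

With a single host, `worker_list` comes out as the empty string, which matches the default in the signature.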
Raises | |
---|---|
`UnavailableError` | If no trace event is collected. |
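A caller may prefer to handle the "no trace event" case rather than let the exception propagate. The sketch below shows one possible pattern. `trace_or_report` is a hypothetical wrapper, and the trace function and error type are passed in so the snippet runs without TensorFlow. With TensorFlow, the assumption is that you would pass `tf.profiler.experimental.client.trace` and `tf.errors.UnavailableError`.

```python
def trace_or_report(trace_fn, no_events_error, service_addr, logdir, duration_ms):
    """Call trace_fn; return True on success, False if no events were collected."""
    try:
        trace_fn(service_addr, logdir, duration_ms)
    except no_events_error as exc:
        print(f"No trace events collected from {service_addr}: {exc}")
        return False
    return True


class FakeUnavailableError(Exception):
    """Stand-in (assumption) for tf.errors.UnavailableError."""


def fake_trace_without_events(service_addr, logdir, duration_ms):
    """Stand-in trace call that always fails the way an idle server might."""
    raise FakeUnavailableError("no trace event is collected")


ok = trace_or_report(fake_trace_without_events, FakeUnavailableError,
                     "grpc://localhost:6009", "/tmp/tb_log", 2000)
print(ok)
```

Other exception types are deliberately not caught, so unrelated failures still surface to the caller.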
Example usage (CPU/GPU):

```python
import tensorflow as tf

# Start a profiler server before your model runs.
tf.profiler.experimental.server.start(6009)
# your model code.
# Send gRPC request to the profiler server to collect a trace of your model.
tf.profiler.experimental.client.trace('grpc://localhost:6009',
                                      '/tmp/tb_log', 2000)
```
Example usage (TPU):

```python
import tensorflow as tf

# Send gRPC request to a TPU worker to collect a trace of your model. A
# profiler service has been started in the TPU worker at port 8466.
# E.g. your TPU IP address is 10.0.0.2 and you want to profile for 2 seconds.
tf.profiler.experimental.client.trace('grpc://10.0.0.2:8466',
                                      'gs://your_tb_dir', 2000)
```
Example usage (Multiple TPUs):

```python
import tensorflow as tf

# Send gRPC request to a TPU pod to collect a trace of your model on multiple
# TPUs. A profiler service has been started in all the TPU workers at the
# port 8466.
# E.g. your TPU IP addresses are 10.0.0.2, 10.0.0.3, 10.0.0.4, and you want to
# profile for 2 seconds.
tf.profiler.experimental.client.trace('grpc://10.0.0.2:8466',
                                      'gs://your_tb_dir',
                                      2000, '10.0.0.3,10.0.0.4')
```
Launch TensorBoard and point it to the same logdir you provided to this API (`/tmp/tb_log`, or `gs://your_tb_dir` in the TPU examples above):
$ tensorboard --logdir=/tmp/tb_log
Open your browser and go to localhost:6006/#profile to view the profiling results.