Thicket and Extra-P: Thicket Tutorial

Thicket is a Python-based toolkit for Exploratory Data Analysis (EDA) of parallel performance data that enables performance optimization and understanding of applications’ performance on supercomputers. It bridges the performance tool gap between being able to consider only a single instance of a simulation run (e.g., single platform, single measurement tool, or single scale) and finding actionable insights in multi-dimensional, multi-scale, multi-architecture, and multi-tool performance datasets.

NOTE: An interactive version of this notebook is available in the Binder environment.


Thicket Modeling Example

This notebook provides an example of using Thicket’s modeling feature. The modeling capability relies on Extra-P, a tool for empirical performance modeling. Extra-P can perform N-parameter modeling for up to three parameters (N ≤ 3). The models follow a so-called Performance Model Normal Form (PMNF) that expresses each model as a sum of polynomial and logarithmic terms. One of the biggest advantages of this modeling method is that the produced models are human-readable and easy to understand.
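In PMNF, a model is a sum of terms of the form c · p^i · log2(p)^j. A minimal sketch of evaluating such a form in plain Python (the coefficients and exponents here are made up for illustration, not produced by Extra-P):

```python
import math

def pmnf(p, terms):
    """Evaluate a PMNF model: a sum of c * p**i * log2(p)**j terms."""
    return sum(c * p**i * math.log2(p)**j for c, i, j in terms)

# Hypothetical two-term model: 2.5 + 0.01 * p * log2(p)
terms = [(2.5, 0, 0), (0.01, 1, 1)]
print(pmnf(8, terms))  # 2.5 + 0.01 * 8 * 3 ≈ 2.74
```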


To explore the capabilities of Thicket with Extra-P, we begin by importing the necessary packages.

[1]:
import sys

import matplotlib.pyplot as plt
import pandas as pd
from IPython.display import display
from IPython.display import HTML

import thicket as th
from thicket.model_extrap import Modeling

display(HTML("<style>.container { width:80% !important; }</style>"))

In this example, we use an MPI scaling study, profiled with Caliper, that has metadata about the runs. The data is also already aggregated, which means we can provide the data to Extra-P as-is.

[2]:
data = "../data/mpi_scaling_cali"
t_ens = th.Thicket.from_caliperreader(data)

Specifically, the metadata table for this set of profiles contains a jobsize column, which records the number of cores used for each profile.

[3]:
t_ens.metadata["jobsize"]
[3]:
profile
-8529698857407510100     27
-4573114402704839186    216
-2577015563847349132     64
-356820152305396749     125
 270342274496300704     343
Name: jobsize, dtype: int64
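The jobsize values form the model’s parameter domain; sorted, they are the perfect cubes 3³ through 7³. A toy reconstruction of the series with plain pandas (the index values are illustrative profile hashes copied from the output above):

```python
import pandas as pd

# Stand-in for t_ens.metadata["jobsize"], built from the values shown above.
jobsize = pd.Series(
    {-8529698857407510100: 27, -4573114402704839186: 216,
     -2577015563847349132: 64, -356820152305396749: 125,
     270342274496300704: 343},
    name="jobsize",
)
print(sorted(jobsize))  # [27, 64, 125, 216, 343], i.e. 3^3 .. 7^3
```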

You can use Python’s built-in help() function to see the documentation for a given object by typing help(object). This shows the object’s arguments and what it returns. An example is below.

[4]:
help(Modeling)
Help on class Modeling in module thicket.model_extrap:

class Modeling(builtins.object)
 |  Modeling(tht, param_name, params=None, chosen_metrics=None)
 |
 |  Produce models for all the metrics across the given graphframes.
 |
 |  Methods defined here:
 |
 |  __init__(self, tht, param_name, params=None, chosen_metrics=None)
 |      Create a new model object.
 |
 |      Adds a model column for each metric for each common frame across all the
 |      graphframes.
 |
 |      The given list of params contains the parameters to build the models.  For
 |      example, MPI ranks, input sizes, and so on.
 |
 |      Arguments:
 |          tht (Thicket): thicket object
 |          param_name (str): arbitrary if 'params' is being provided, otherwise name of
 |              the metadata column from which 'params' will be extracted
 |          params (list): parameters list, domain for the model
 |          chosen_metrics (list): metrics to be evaluated in the model, range for the
 |              model
 |
 |  componentize_statsframe(self, columns=None)
 |      Componentize multiple Extra-P modeling objects in the aggregated statistics
 |      table
 |
 |      Arguments:
 |          column (list): list of column names in the aggregated statistics table to
 |              componentize. Values must be of type 'thicket.model_extrap.ModelWrapper'.
 |
 |  produce_models(self, agg_func=<function mean at 0x7fcffb925160>, add_stats=True)
 |      Produces an Extra-P model. Models are generated by calling Extra-P's
 |          ModelGenerator.
 |
 |      Arguments:
 |          agg_func (function): aggregation function to apply to multi-dimensional
 |              measurement values. Extra-P v4.0.4 applies mean by default so that is
 |              set here for clarity.
 |          add_stats (bool): Option to add hypothesis function statistics to the
 |              aggregated statistics table
 |
 |  to_html(self, RSS=False)
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)

First, we construct the Modeling object by passing all the relevant data to it. We provide jobsize as the param_name argument so the model will grab this column from the metadata table to use as our parameter. We also sub-select some metrics, since this dataset has many metrics and modeling all of them would take a long time.

Then, we call produce_models on that object (it is unnecessary to provide an aggregation function since the data is already aggregated).

NOTE: For this example, you can view all the metric columns by adding a new cell and running: t_ens.performance_cols.

[5]:
mdl = Modeling(
    t_ens,
    "jobsize",
    chosen_metrics=[
        "Total time",
    ],
)

mdl.produce_models()

Model hypothesis functions are stored in thicket’s aggregated statistics table.

[6]:
t_ens.statsframe.dataframe
[6]:
name Total time_extrap-model Total time_RSS_extrap-model Total time_rRSS_extrap-model Total time_SMAPE_extrap-model Total time_AR2_extrap-model Total time_RE_extrap-model
node
{'name': 'MPI_Allreduce', 'type': 'function'} MPI_Allreduce -0.002483573830186516 + 4.672313710732996e-09 ... 2.373695e-02 33.638540 89.818388 0.982806 1.569021
{'name': 'MPI_Bcast', 'type': 'function'} MPI_Bcast 0.005594622766668955 + 1.1211777143604533e-05 ... 5.234884e-03 0.592849 23.469301 0.994136 0.222561
{'name': 'MPI_Comm_dup', 'type': 'function'} MPI_Comm_dup 0.2071419993096192 + 0.00037948721323388787 * ... 1.468475e+01 1.458439 62.937895 0.324860 0.467371
{'name': 'MPI_Comm_free', 'type': 'function'} MPI_Comm_free 2.9748964461513132e-05 + 2.9632810629043053e-0... 4.012398e-08 0.026386 7.122115 0.995381 0.054844
{'name': 'MPI_Comm_split', 'type': 'function'} MPI_Comm_split 0.03409920994697457 + 4.861767349462248e-07 * ... 4.094649e+00 1.584798 51.159733 0.904711 0.445655
{'name': 'MPI_Gather', 'type': 'function'} MPI_Gather 1.111071710084133e-05 + 1.875931772029738e-09 ... 1.805567e-07 2.319450 46.763355 0.857675 0.515782
{'name': 'MPI_Initialized', 'type': 'function'} MPI_Initialized -1.2157511991332251e-06 + 4.4736500077363466e-... 2.079887e-09 0.016595 5.660712 0.997857 0.046511
{'name': 'main', 'type': 'function'} main 5.334585402555236 + 50.44722398708033 * p^(1) 1.428021e+07 0.132731 19.564485 0.886893 0.157082
{'name': 'MPI_Barrier', 'type': 'function'} MPI_Barrier 1.0395240000000001 7.149934e+00 49051.554796 131.650754 1.000000 0.455439
{'name': 'MPI_Irecv', 'type': 'function'} MPI_Irecv 0.00016930423157775063 + 5.344486361848513e-05... 5.902043e-04 0.048232 10.546219 0.985537 0.084786
{'name': 'MPI_Isend', 'type': 'function'} MPI_Isend -0.7378766027055432 + 0.011372374310428852 * p... 5.386096e-01 5023.400772 100.382880 0.937724 19.620193
{'name': 'MPI_Reduce', 'type': 'function'} MPI_Reduce 0.00895051336452584 + 2.155160690292718e-05 * ... 1.756609e-03 32.371981 61.564145 0.436922 1.295318
{'name': 'MPI_Wait', 'type': 'function'} MPI_Wait -0.08273770855385455 + 0.000898503598134379 * ... 2.613648e-01 15.106425 63.097699 0.567859 1.000543
{'name': 'MPI_Waitall', 'type': 'function'} MPI_Waitall 0.0118324 1.275860e-03 282.767881 109.810235 1.000000 0.508850
{'name': 'lulesh.cycle', 'type': 'function'} lulesh.cycle 6.361023656694157 + 50.400696591892334 * p^(1) 1.431665e+07 0.133362 19.610254 0.886407 0.157454
{'name': 'LagrangeLeapFrog', 'type': 'function'} LagrangeLeapFrog -588.7139217783036 + 118.0201880558251 * p^(4/5) 3.026414e+06 0.040647 9.523960 0.947871 0.076241
{'name': 'CalcTimeConstraintsForElems', 'type': 'function'} CalcTimeConstraintsForElems 0.2912128378771599 + 0.15464468104595394 * p^(1) 6.423611e-01 0.010278 3.339056 0.999446 0.027639
{'name': 'LagrangeElements', 'type': 'function'} LagrangeElements 55.80235706834334 + 12.219839260204244 * p^(1) 2.180612e+05 0.041861 11.178410 0.970060 0.089881
{'name': 'ApplyMaterialPropertiesForElems', 'type': 'function'} ApplyMaterialPropertiesForElems 6.9091923295994295 + 3.812718071421943 * p^(1) 1.344775e+03 0.001975 2.050317 0.998092 0.016486
{'name': 'EvalEOSForElems', 'type': 'function'} EvalEOSForElems 7.437812732121649 + 3.7197759010830898 * p^(1) 1.057620e+03 0.001701 1.905893 0.998424 0.015326
{'name': 'CalcEnergyForElems', 'type': 'function'} CalcEnergyForElems 7.361621962061197 + 2.352394804115736 * p^(1) 1.943577e+02 0.001304 1.863541 0.999275 0.015001
{'name': 'CalcLagrangeElements', 'type': 'function'} CalcLagrangeElements -3.8072048974340813 + 1.5032184055481053 * p^(... 7.778453e+02 0.002324 2.324551 0.998118 0.018461
{'name': 'CalcKinematicsForElems', 'type': 'function'} CalcKinematicsForElems -3.1910637499813572 + 1.4407396344757237 * p^(... 9.575080e+02 0.002845 2.418832 0.997478 0.019204
{'name': 'CalcQForElems', 'type': 'function'} CalcQForElems 6.9792314208895 + 2.7990338043031575 * p^(3/4)... 3.048464e+05 0.191824 21.996816 0.795277 0.175135
{'name': 'CalcMonotonicQForElems', 'type': 'function'} CalcMonotonicQForElems -4.0210925162888875 + 0.4959594252012758 * p^(... 1.628638e+02 0.014441 5.668751 0.996383 0.044701
{'name': 'MPI_Irecv', 'type': 'function'} MPI_Irecv 0.020640239201915305 + 0.0009474689031858099 *... 1.147505e-02 0.003239 2.874863 0.997552 0.023065
{'name': 'MPI_Isend', 'type': 'function'} MPI_Isend -0.10548527169972818 + 0.004772194045454103 * ... 1.151804e+00 0.009142 4.727088 0.990329 0.038011
{'name': 'MPI_Wait', 'type': 'function'} MPI_Wait 316.42006979999996 2.830610e+05 7.769940 57.079604 1.000000 0.588484
{'name': 'MPI_Waitall', 'type': 'function'} MPI_Waitall -28.320030462369225 + 0.564949832321966 * p^(1... 1.854066e+03 7.732766 50.080415 0.819884 0.720597
{'name': 'LagrangeNodal', 'type': 'function'} LagrangeNodal -501.7214996437175 + 103.6028442396303 * p^(3/4) 1.307338e+06 0.052532 11.924869 0.945355 0.096236
{'name': 'CalcForceForNodes', 'type': 'function'} CalcForceForNodes -483.43658919501894 + 94.44404618163017 * p^(3/4) 7.615735e+05 0.028787 7.967296 0.961502 0.063767
{'name': 'CalcVolumeForceForElems', 'type': 'function'} CalcVolumeForceForElems -8.924824714806517 + 18.087142774934247 * p^(1) 4.205287e+04 0.007481 4.064200 0.997349 0.033025
{'name': 'CalcHourglassControlForElems', 'type': 'function'} CalcHourglassControlForElems -18.407641701933056 + 15.299434861302807 * p^(1) 3.933893e+04 0.005348 3.804897 0.996535 0.030703
{'name': 'CalcFBHourglassForceForElems', 'type': 'function'} CalcFBHourglassForceForElems -3.4541479782958793 + 1.9523476557370112 * p^(... 4.555233e+02 0.007429 3.570690 0.999346 0.028786
{'name': 'IntegrateStressForElems', 'type': 'function'} IntegrateStressForElems -3.2210357806507117 + 1.315518813127615 * p^(3... 9.669293e+02 0.003183 2.796725 0.996946 0.022230
{'name': 'MPI_Irecv', 'type': 'function'} MPI_Irecv 0.07139784821037126 + 0.001291317091147029 * p... 1.217607e-02 0.038615 6.042882 0.999487 0.051334
{'name': 'MPI_Isend', 'type': 'function'} MPI_Isend -0.1347716501223986 + 0.006980575896793848 * p... 4.308979e-01 0.005076 3.554862 0.998306 0.028019
{'name': 'MPI_Wait', 'type': 'function'} MPI_Wait 200.4196336 6.491636e+04 23.569468 57.976850 1.000000 0.363086
{'name': 'MPI_Waitall', 'type': 'function'} MPI_Waitall 367.49135739999997 3.655812e+05 580334.061025 78.875714 1.000000 0.513362
{'name': 'MPI_Irecv', 'type': 'function'} MPI_Irecv -0.055281882470313876 + 0.005354748856582523 *... 7.789738e-01 0.144243 16.670493 0.856270 0.133853
{'name': 'MPI_Isend', 'type': 'function'} MPI_Isend -0.037964774493256734 + 0.008521820556730944 *... 9.730921e+00 0.178043 21.806582 0.867448 0.182547
{'name': 'MPI_Wait', 'type': 'function'} MPI_Wait 60.78119099999999 9.787083e+03 304.830597 77.909515 1.000000 0.411681
{'name': 'MPI_Waitall', 'type': 'function'} MPI_Waitall -90.56809038671462 + 5.542383651985729 * log2(... 6.315946e+04 2.822066 46.254668 -0.214628 0.477486
{'name': 'TimeIncrement', 'type': 'function'} TimeIncrement 108.79336164114491 + 0.41659851800619807 * p^(... 5.231852e+06 0.681284 43.573576 0.614904 0.339668
{'name': 'MPI_Allreduce', 'type': 'function'} MPI_Allreduce 108.66456642370434 + 0.41649782186503276 * p^(... 5.232565e+06 0.681661 43.577591 0.614677 0.339689

We can also render the models for every (node, sub-selected metric) combination as an HTML table:

[7]:
with pd.option_context("display.max_colwidth", 1):
    display(HTML(mdl.to_html()))
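Columns such as Total time_RSS_extrap-model and Total time_SMAPE_extrap-model hold goodness-of-fit statistics for each hypothesis function. As a rough illustration of what these two measure (computed here with plain NumPy, not Extra-P’s own routines, on made-up data):

```python
import numpy as np

measured  = np.array([1.0, 2.1, 4.2, 7.9])   # hypothetical measurements
predicted = np.array([1.1, 2.0, 4.0, 8.0])   # hypothetical model values

# Residual sum of squares: sum of squared errors.
rss = np.sum((measured - predicted) ** 2)

# Symmetric mean absolute percentage error, in percent.
smape = 100.0 * np.mean(
    2 * np.abs(predicted - measured) / (np.abs(measured) + np.abs(predicted))
)
print(rss, smape)  # smaller values indicate a better fit
```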

The first node, {"name": "MPI_Allreduce", "type": "function"}, has an interesting model, so we retrieve it. This can be achieved by indexing the aggregated statistics table (t_ens.statsframe.dataframe) at our chosen node and the Total time_extrap-model column.

[8]:
model_obj = t_ens.statsframe.dataframe.at[t_ens.statsframe.dataframe.index[0], "Total time_extrap-model"]

We can evaluate the model at a parameter value, as if it were a function.

[9]:
model_obj.eval(600)

[9]:
9.311422624087946
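eval(600) simply plugs p = 600 into the fitted PMNF expression. For instance, the model for the main node shown in the table above, 5.334585402555236 + 50.44722398708033 · p, can be reproduced as a plain function (coefficients copied from the table output):

```python
def main_model(p):
    # Coefficients from the 'main' row of the aggregated statistics table.
    return 5.334585402555236 + 50.44722398708033 * p

# Evaluating at p = 600 cores, as model_obj.eval(600) would for this node.
print(main_model(600))
```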

The display() method returns a figure and an axes object. The axes object can be used to adjust the plot, e.g., to change labels. display() takes an RSS (bool) argument that determines whether the Extra-P RSS is shown on the plot.

[10]:
plt.clf()
fig, ax = model_obj.display(RSS=False)
plt.show()
<Figure size 640x480 with 0 Axes>
(Figure: plot of the Extra-P model for the selected node)
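Because display() hands back standard Matplotlib objects, the plot can be customized with ordinary Matplotlib calls. A minimal sketch, using a stand-alone axes in place of the one returned by display() (the labels here are illustrative):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()          # stands in for: fig, ax = model_obj.display(RSS=False)
ax.set_xlabel("jobsize (cores)")  # relabel the parameter axis
ax.set_ylabel("Total time (s)")
ax.set_title("MPI_Allreduce model")
fig.savefig("model_plot.png")     # or plt.show() in a notebook
```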