author | Tvrtko Ursulin <tvrtko.ursulin@intel.com> | 2017-03-30 14:32:29 +0100 |
---|---|---|
committer | Tvrtko Ursulin <tvrtko.ursulin@intel.com> | 2017-04-25 14:53:46 +0100 |
commit | 054eb1abecd1cce2e4ee0516f3ff8a67a35dca22 (patch) | |
tree | 8083331eb239e3c5f49d776adac040f3fffff717 /benchmarks/wsim | |
parent | cf6f2c9be161e3ca6dd210f4d576cea52883c6bc (diff) |
benchmarks/gem_wsim: Command submission workload simulator
Tool which emits batch buffers to engines with configurable
sequences, durations, contexts, dependencies and userspace waits.
Unfinished but shows promise so sending out for early feedback.
v2:
* Load workload descriptors from files. (also -w)
* Help text.
* Calibration control if needed. (-t)
* NORELOC | LUT to eb flags.
* Added sample workload to wsim/workload1.
v3:
* Multiple parallel different workloads (-w -w ...).
* Multi-context workloads.
* Variable (random) batch length.
* Load balancing (round robin and queue depth estimation).
* Workload delays and explicit sync steps.
* Workload frequency (period) control.
v4:
* Fixed queue-depth estimation by creating separate batches
per engine when qd load balancing is on.
* Dropped separate -s cmd line option. It can turn itself on
automatically when needed.
* Keep a single status page and lie about the write hazard
as suggested by Chris.
* Use batch_start_offset for controlling the batch duration.
(Chris)
* Set status page object cache level. (Chris)
* Moved workload description to a README.
* Tidied example workloads.
* Some other cleanups and refactorings.
v5:
* Master and background workloads (-W / -w).
* Single batch per step is enough even when balancing. (Chris)
* Use hars_petruska_f54_1_random IGT functions and seed to zero
at start. (Chris)
* Use WC cache domain when WC mapping. (Chris)
* Keep seqnos 64-bytes apart in the status page. (Chris)
* Add workload throttling and queue-depth throttling commands.
(Chris)
v6:
* Added two more workloads.
* Merged RT balancer from Chris.
v7:
* Merged NO_RELOC patch from Chris.
* Added missing RT balancer to help text.
TODO list:
* Fence support.
* Batch buffer caching (re-use pool).
* Better error handling.
* Less 1980's workload parsing.
* More workloads.
* Threads?
* ... ?
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Diffstat (limited to 'benchmarks/wsim')
-rw-r--r-- | benchmarks/wsim/README | 56 |
-rw-r--r-- | benchmarks/wsim/media_17i7.wsim | 7 |
-rw-r--r-- | benchmarks/wsim/media_19.wsim | 10 |
-rw-r--r-- | benchmarks/wsim/media_load_balance_17i7.wsim | 7 |
-rw-r--r-- | benchmarks/wsim/media_load_balance_19.wsim | 10 |
-rw-r--r-- | benchmarks/wsim/vcs1.wsim | 26 |
-rw-r--r-- | benchmarks/wsim/vcs_balanced.wsim | 26 |
7 files changed, 142 insertions, 0 deletions
diff --git a/benchmarks/wsim/README b/benchmarks/wsim/README new file mode 100644 index 00000000..7aa0694a --- /dev/null +++ b/benchmarks/wsim/README @@ -0,0 +1,56 @@ +Workload descriptor format +========================== + +ctx.engine.duration_us.dependency.wait,... +<uint>.<str>.<uint>[-<uint>].<int <= 0>.<0|1>,... +d|p|s.<uiny>,... + +For duration a range can be given from which a random value will be picked +before every submit. Since this and seqno management requires CPU access to +objects, care needs to be taken in order to ensure the submit queue is deep +enough these operations do not affect the execution speed unless that is +desired. + +Additional workload steps are also supported: + + 'd' - Adds a delay (in microseconds). + 'p' - Adds a delay relative to the start of previous loop so that the each loop + starts execution with a given period. + 's' - Synchronises the pipeline to a batch relative to the step. + 't' - Throttle every n batches + 'q' - Throttle to n max queue depth + +Engine ids: RCS, BCS, VCS, VCS1, VCS2, VECS + +Example (leading spaces must not be present in the actual file): +---------------------------------------------------------------- + + 1.VCS1.3000.0.1 + 1.RCS.500-1000.-1.0 + 1.RCS.3700.0.0 + 1.RCS.1000.-2.0 + 1.VCS2.2300.-2.0 + 1.RCS.4700.-1.0 + 1.VCS2.600.-1.1 + p.16000 + +The above workload described in human language works like this: + + 1. A batch is sent to the VCS1 engine which will be executing for 3ms on the + GPU and userspace will wait until it is finished before proceeding. + 2-4. Now three batches are sent to RCS with durations of 0.5-1.5ms (random + duration range), 3.7ms and 1ms respectively. The first batch has a data + dependency on the preceding VCS1 batch, and the last of the group depends + on the first from the group. + 5. Now a 2.3ms batch is sent to VCS2, with a data dependency on the 3.7ms + RCS batch. + 6. This is followed by a 4.7ms RCS batch with a data dependency on the 2.3ms + VCS2 batch. + 7. 
Then a 0.6ms VCS2 batch is sent depending on the previous RCS one. In the + same step the tool is told to wait for the batch completes before + proceeding. + 8. Finally the tool is told to wait long enough to ensure the next iteration + starts 16ms after the previous one has started. + +When workload descriptors are provided on the command line, commas must be used +instead of new lines. diff --git a/benchmarks/wsim/media_17i7.wsim b/benchmarks/wsim/media_17i7.wsim new file mode 100644 index 00000000..5f533d8e --- /dev/null +++ b/benchmarks/wsim/media_17i7.wsim @@ -0,0 +1,7 @@ +1.VCS1.3000.0.1 +1.RCS.1000.-1.0 +1.RCS.3700.0.0 +1.RCS.1000.-2.0 +1.VCS2.2300.-2.0 +1.RCS.4700.-1.0 +1.VCS2.600.-1.1 diff --git a/benchmarks/wsim/media_19.wsim b/benchmarks/wsim/media_19.wsim new file mode 100644 index 00000000..f210d794 --- /dev/null +++ b/benchmarks/wsim/media_19.wsim @@ -0,0 +1,10 @@ +0.VECS.1400-1500.0.0 +0.RCS.1000-1500.-1.0 +s.-2 +2.VCS2.50-350.0.1 +1.VCS1.1300-1400.0.1 +0.VECS.1400-1500.0.0 +0.RCS.100-300.-1.1 +2.RCS.1300-1500.0.0 +2.VCS2.100-300.-1.1 +1.VCS1.900-1400.0.1 diff --git a/benchmarks/wsim/media_load_balance_17i7.wsim b/benchmarks/wsim/media_load_balance_17i7.wsim new file mode 100644 index 00000000..25a69203 --- /dev/null +++ b/benchmarks/wsim/media_load_balance_17i7.wsim @@ -0,0 +1,7 @@ +1.VCS.3000.0.1 +1.RCS.1000.-1.0 +1.RCS.3700.0.0 +1.RCS.1000.-2.0 +1.VCS.2300.-2.0 +1.RCS.4700.-1.0 +1.VCS.600.-1.1 diff --git a/benchmarks/wsim/media_load_balance_19.wsim b/benchmarks/wsim/media_load_balance_19.wsim new file mode 100644 index 00000000..03890776 --- /dev/null +++ b/benchmarks/wsim/media_load_balance_19.wsim @@ -0,0 +1,10 @@ +0.VECS.1400-1500.0.0 +0.RCS.1000-1500.-1.0 +s.-2 +1.VCS.50-350.0.1 +1.VCS.1300-1400.0.1 +0.VECS.1400-1500.0.0 +0.RCS.100-300.-1.1 +1.RCS.1300-1500.0.0 +1.VCS.100-300.-1.1 +1.VCS.900-1400.0.1 diff --git a/benchmarks/wsim/vcs1.wsim b/benchmarks/wsim/vcs1.wsim new file mode 100644 index 00000000..9d3e682b --- /dev/null +++ 
b/benchmarks/wsim/vcs1.wsim @@ -0,0 +1,26 @@ +t.5 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 +0.VCS1.500-2000.0.0 diff --git a/benchmarks/wsim/vcs_balanced.wsim b/benchmarks/wsim/vcs_balanced.wsim new file mode 100644 index 00000000..e8958b8f --- /dev/null +++ b/benchmarks/wsim/vcs_balanced.wsim @@ -0,0 +1,26 @@ +q.5 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 +0.VCS.500-2000.0.0 |
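The descriptor grammar from the README in this patch (ctx.engine.duration_us.dependency.wait, plus the single-letter d/p/s/t/q steps) can be illustrated with a small parser. This is only a sketch in Python and not the parser used by gem_wsim itself; the names Batch, Control, parse_step, parse_workload and to_command_line are hypothetical, and the validation rules are assumptions read off the README text (for example, that a dependency must be <= 0 and that the wait field is 0 or 1).

```python
from dataclasses import dataclass
from typing import List, Union

# Engine ids and step letters as listed in the README.
ENGINES = {"RCS", "BCS", "VCS", "VCS1", "VCS2", "VECS"}
CONTROL_STEPS = {"d", "p", "s", "t", "q"}


@dataclass
class Batch:
    """One batch step (hypothetical type, for illustration only)."""
    ctx: int
    engine: str
    min_us: int      # lower bound of the duration range
    max_us: int      # upper bound; equals min_us for a fixed duration
    dependency: int  # 0 for none, negative for a relative earlier step
    wait: bool       # True if userspace waits for completion


@dataclass
class Control:
    """One d/p/s/t/q step (hypothetical type, for illustration only)."""
    kind: str
    value: int


def parse_step(text: str) -> Union[Batch, Control]:
    fields = text.strip().split(".")
    if fields[0] in CONTROL_STEPS:
        if len(fields) != 2:
            raise ValueError(f"bad control step: {text!r}")
        return Control(fields[0], int(fields[1]))
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields: {text!r}")
    ctx, engine, duration, dependency, wait = fields
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine!r}")
    # A duration is either a single value or a "min-max" range.
    lo, sep, hi = duration.partition("-")
    min_us = int(lo)
    max_us = int(hi) if sep else min_us
    if max_us < min_us:
        raise ValueError(f"bad duration range: {duration!r}")
    dep = int(dependency)
    if dep > 0:
        raise ValueError(f"dependency must be <= 0: {dependency!r}")
    if wait not in ("0", "1"):
        raise ValueError(f"wait must be 0 or 1: {wait!r}")
    return Batch(int(ctx), engine, min_us, max_us, dep, wait == "1")


def parse_workload(text: str) -> List[Union[Batch, Control]]:
    """Accept both file form (new lines) and command-line form (commas)."""
    parts = text.replace("\n", ",").split(",")
    return [parse_step(p) for p in parts if p.strip()]


def to_command_line(file_text: str) -> str:
    """Join the steps of a workload file with commas, per the README."""
    return ",".join(s.strip() for s in file_text.splitlines() if s.strip())
```

As a usage example, parse_workload("1.VCS1.3000.0.1\np.16000") yields a Batch on VCS1 with a fixed 3000us duration followed by a Control step of kind 'p', and to_command_line() turns the same text into the comma-separated form the README describes for command-line use.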