A Peel package bundles together the configuration data, datasets, and workload applications required to execute a particular collection of experiments, and is therefore referred to as a Peel bundle for short.
The code snippets below assume that the following shell variables are set. Modify them accordingly before running the code.
# for all bundles
export BUNDLE_BIN=~/bundles/bin                  # bundle binaries parent
export BUNDLE_SRC=~/bundles/src                  # bundle sources parent
# for the current bundle
export BUNDLE_GID=com.acme                       # bundle groupId
export BUNDLE_AID=peel-wordcount                 # bundle artifactId
export BUNDLE_PKG=com.acme.benchmarks.wordcount  # bundle root package
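Since every later command depends on these variables, it can help to confirm that they are all set before continuing. The following is a minimal sketch and not part of Peel; check_bundle_vars is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper (not part of Peel): report any bundle variable
# that is unset or empty, and return non-zero if at least one is missing.
check_bundle_vars() {
  missing=0
  for name in BUNDLE_BIN BUNDLE_SRC BUNDLE_GID BUNDLE_AID BUNDLE_PKG; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Once the function is defined, running check_bundle_vars in the same shell prints one line per missing variable and succeeds silently when all five are set.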
Tip: If you intend to maintain multiple bundles, we suggest defining
BUNDLE_SRC in your
If you don’t want to version the code and configuration data of your bundle, your best option is to download and extract the pre-packaged empty bundle archive.
wget https://github.com/stratosphere/peel/releases/download/v1.1.8/peel-empty-bundle-1.1.8.tar.gz
mkdir -p "$BUNDLE_BIN/$BUNDLE_AID"
tar -xzvf peel-empty-bundle-1.1.8.tar.gz -C "$BUNDLE_BIN/$BUNDLE_AID"
cd "$BUNDLE_BIN/$BUNDLE_AID"
If you intend to version the code and configuration data of your bundle, your best option is to bootstrap a project structure from a Peel archetype.
cd "$BUNDLE_SRC"
mvn archetype:generate -B \
    -Dpackage="$BUNDLE_PKG" \
    -DgroupId="$BUNDLE_GID" \
    -DartifactId="$BUNDLE_AID" \
    -DarchetypeGroupId=org.peelframework \
    -DarchetypeArtifactId=peel-flinkspark-bundle \
    -DarchetypeVersion=1.1.8
cd "$BUNDLE_AID"
mvn clean deploy
cd "$BUNDLE_BIN/$BUNDLE_AID"
The following archetypes are currently supported:
||A bundle with versioned workload applications for Spark & Flink.|
||A bundle with versioned workload applications for Flink only.|
||A bundle with versioned workload applications for Spark only.|
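Both methods end in the bundle directory, where the commands in the following sections expect to find the peel.sh launcher. A quick way to confirm this is sketched below; bundle_ready is a hypothetical helper, not a Peel command, and it assumes that peel.sh sits at the top level of the bundle directory and is executable.

```shell
# Hypothetical helper (not part of Peel): succeed only if the given
# directory contains an executable peel.sh launcher.
bundle_ready() {
  bundle_dir=$1
  if [ -x "$bundle_dir/peel.sh" ]; then
    echo "bundle found in $bundle_dir"
  else
    echo "no executable peel.sh in $bundle_dir" >&2
    return 1
  fi
}
```

For example, bundle_ready "$BUNDLE_BIN/$BUNDLE_AID" should succeed after either of the two setup methods above has completed.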
Run the Example Experiment
From the $BUNDLE_BIN/$BUNDLE_AID directory, run the following command:
./peel.sh suite:run wordcount.default
This triggers the execution of an example suite consisting of two experiments, which run a Wordcount job on Flink and Spark, respectively.
Each job will be repeated three times, and the results and raw log data for each run will be stored in the
Check the Results
Peel ships with facilities to extract, transform, and load the raw data from your experiments into a relational database. To do this for the Wordcount job, run the following commands:
./peel.sh db:initialize
./peel.sh db:import wordcount.default
You can then start analyzing your experiment data with SQL queries and data analysis tools that can use a relational database as a backend.
For example, the following SQL query retrieves the min, median, and max runtime for the two experiments in the
SELECT   e.suite                                 as suite       ,
         e.name                                  as name        ,
         MIN(r.time)                             as min_time    ,
         MAX(r.time)                             as max_time    ,
         SUM(r.time) - MIN(r.time) - MAX(r.time) as median_time
FROM     experiment                              as e           ,
         experiment_run                          as r
WHERE    e.id    = r.experiment_id
AND      e.suite = 'wordcount.default'
GROUP BY e.suite, e.name
ORDER BY e.suite, e.name
If you’re using the archetype method, you can run the above query directly through the custom Peel command shipped with your bundle:
./peel.sh query:runtimes wordcount.default
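The median_time expression in the query above works because each experiment in this suite has exactly three runs: subtracting the smallest and the largest value from the sum leaves the middle one. With any other number of runs, the expression no longer yields the median. The same arithmetic is sketched in shell below, using made-up integer runtimes; median_of_three is a hypothetical helper, not a Peel command.

```shell
# Hypothetical helper: median of exactly three integers, computed as
# sum - min - max, mirroring the median_time expression in the SQL query.
median_of_three() {
  a=$1
  b=$2
  c=$3
  min=$a
  max=$a
  for v in "$b" "$c"; do
    if [ "$v" -lt "$min" ]; then min=$v; fi
    if [ "$v" -gt "$max" ]; then max=$v; fi
  done
  echo $(( a + b + c - min - max ))
}

median_of_three 1200 900 1500   # illustrative values; prints 1200
```

If you change the number of repetitions in a suite, the query needs a different way of computing the median.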
Interested in learning more? Check the Motivation section for a brief introduction to system experiment vocabulary and concepts and an overview of the problems Peel will solve for you. Alternatively, go directly to Bundle Basics if you want to get your hands dirty right away!