Generating the 300 Spark configurations

To generate the 300 Spark configurations, we use the OpenTuner library. The available range of values for each parameter is defined in src/python/spark-config-generator.py. To generate the 300 configurations, run the following from the directory that contains the script:

mkdir -p configs
for i in $(seq 1 300)
do
	# Each run writes one random configuration to ./out
	python3 spark-config-generator.py
	mv out "configs/$i"
done

The process may take a while. The 300 configurations will be available in the configs folder.
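
Because the loop runs for a while, it can be useful to confirm afterwards that no run was skipped. The following is a minimal sketch, assuming the loop above was used unchanged so that the results are named configs/1 through configs/300. The helper name missing_configs is hypothetical and not part of the repository.

```python
from pathlib import Path


def missing_configs(config_dir="configs", expected=300):
    """Return the indices in 1..expected that have no entry in config_dir."""
    root = Path(config_dir)
    return [i for i in range(1, expected + 1) if not (root / str(i)).exists()]


if __name__ == "__main__":
    missing = missing_configs()
    if missing:
        # Show only the first few indices to keep the output short.
        print(f"{len(missing)} configurations missing, e.g. {missing[:10]}")
    else:
        print("All expected configurations are present.")
```

Any index it reports can be regenerated by rerunning the script once and moving out to that name.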

Known issues

spark-config-generator.py

This script uses the OpenTuner library to generate a random Spark configuration covering 107 Spark parameters.

When running the script, you might get the following error:

TypeError: Unicode-objects must be encoded before hashing

To fix this, first locate the installation path of the opentuner package:

python3 -c "import opentuner; print(opentuner.__file__)"
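
The printed path points at the package's __init__.py, and the file to patch lives in the search subdirectory next to it. As a rough sketch, the full path can be resolved like this. The helper name manipulator_path is hypothetical, and the layout is assumed to match the search/manipulator.py path mentioned in this README.

```python
import os


def manipulator_path(package_file):
    """Given the value of opentuner.__file__, return the path of search/manipulator.py."""
    package_dir = os.path.dirname(os.path.abspath(package_file))
    return os.path.join(package_dir, "search", "manipulator.py")


if __name__ == "__main__":
    try:
        import opentuner
    except ImportError:
        print("opentuner is not installed in this interpreter")
    else:
        print(manipulator_path(opentuner.__file__))
```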

Then edit line 858 of search/manipulator.py, inside that package directory, so that it reads:

return hashlib.sha256(repr(self.get_value(config)).encode('utf-8')).hexdigest().encode('utf-8')
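
The reason the patched line works is that, in Python 3, hashlib only accepts bytes, while repr() returns a str. The standalone sketch below reproduces the failure and the fix without OpenTuner. The value 42 is only a stand-in for whatever self.get_value(config) returns.

```python
import hashlib

# Stand-in for repr(self.get_value(config)); repr() returns a str in Python 3.
value = repr(42)

# Hashing a str directly raises TypeError in Python 3.
try:
    hashlib.sha256(value)
except TypeError as exc:
    print("TypeError:", exc)

# Encoding to bytes first works. The final encode() converts the hex digest
# (a str) back to bytes, as the patched line does.
digest = hashlib.sha256(value.encode("utf-8")).hexdigest().encode("utf-8")
print(type(digest).__name__, len(digest))
```

The exact wording of the TypeError message can differ between Python versions.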