How to install a Spark cluster in standalone mode¶
Set the hostname and update /etc/hosts on every node, then:
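For example, each node's hosts file could map the three machines used below (the IP addresses are placeholders for your own network; spark1 and spark2 run the masters, spark3 runs ZooKeeper):

```
# /etc/hosts — replace the addresses with your nodes' real IPs
192.168.1.101 spark1
192.168.1.102 spark2
192.168.1.103 spark3
```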
curl -LO "https://downloads.apache.org/spark/spark-2.4.7/spark-2.4.7-bin-hadoop2.7.tgz"
curl -LO "https://apache.osuosl.org/zookeeper/zookeeper-3.6.2/apache-zookeeper-3.6.2-bin.tar.gz"
tar zxvf spark-2.4.7-bin-hadoop2.7.tgz
tar zxvf apache-zookeeper-3.6.2-bin.tar.gz
mv apache-zookeeper-3.6.2-bin /usr/local/zookeeper
mv spark-2.4.7-bin-hadoop2.7 /usr/local/spark
yum install java-1.8.0-openjdk.x86_64 java-1.8.0-openjdk-devel.x86_64 -y
echo -e 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.282.b08-1.el7_9.x86_64\nexport PATH=$JAVA_HOME/bin:$PATH' >> /etc/profile
source /etc/profile

# start ZooKeeper (on the host named in spark.deploy.zookeeper.url, here spark3)
cd /usr/local/zookeeper/bin
./zkServer.sh start

cd /usr/local/spark
cat conf/ha.conf
spark.deploy.recoveryMode=ZOOKEEPER
spark.deploy.zookeeper.url=spark3:2181
spark.deploy.zookeeper.dir=/usr/local/spark/ha

# start the active master on spark1 and a standby master on spark2
./sbin/start-master.sh -h spark1 --properties-file conf/ha.conf
./sbin/start-master.sh -h spark2 --properties-file conf/ha.conf
# start a worker that registers with both masters (6 cores, 12G memory)
./sbin/start-slave.sh spark://spark1:7077,spark2:7077 -c 6 -m 12G -d /usr/local/spark/work

references:
https://livebook.manning.com/book/spark-in-action/chapter-11/50
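With both masters running, a quick way to exercise the HA pair is to submit a job against the comma-separated master URL: the client registers with whichever master is the current leader and fails over via ZooKeeper if that master dies. A sketch using the SparkPi example bundled with the 2.4.7 distribution (the examples jar path is standard for the hadoop2.7 build but may differ in yours; run from any node with Spark installed):

```
cd /usr/local/spark
# listing both masters lets the client fail over if spark1's master is down
./bin/spark-submit \
  --master spark://spark1:7077,spark2:7077 \
  --class org.apache.spark.examples.SparkPi \
  examples/jars/spark-examples_2.11-2.4.7.jar 100
```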