How to install a Spark cluster in standalone mode¶
Set the hostname and update /etc/hosts on every node, then:
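For example, each node's hosts file could map the three machines used below (the IP addresses are placeholders for your own network; spark1 and spark2 run the masters, spark3 runs ZooKeeper):

```
# /etc/hosts — replace the addresses with your nodes' real IPs
192.168.1.101 spark1
192.168.1.102 spark2
192.168.1.103 spark3
```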
curl -LO "https://downloads.apache.org/spark/spark-2.4.7/spark-2.4.7-bin-hadoop2.7.tgz"
curl -LO "https://apache.osuosl.org/zookeeper/zookeeper-3.6.2/apache-zookeeper-3.6.2-bin.tar.gz"
tar zxvf spark-2.4.7-bin-hadoop2.7.tgz
tar zxvf apache-zookeeper-3.6.2-bin.tar.gz
mv apache-zookeeper-3.6.2-bin /usr/local/zookeeper
mv spark-2.4.7-bin-hadoop2.7 /usr/local/spark
yum install java-1.8.0-openjdk.x86_64 java-1.8.0-openjdk-devel.x86_64 -y
echo -e 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.282.b08-1.el7_9.x86_64\nexport PATH=$JAVA_HOME/bin:$PATH' >> /etc/profile
source /etc/profile

# start ZooKeeper (on the host named in spark.deploy.zookeeper.url, here spark3)
cd /usr/local/zookeeper/bin
./zkServer.sh start

cd /usr/local/spark
cat conf/ha.conf
spark.deploy.recoveryMode=ZOOKEEPER
spark.deploy.zookeeper.url=spark3:2181
spark.deploy.zookeeper.dir=/usr/local/spark/ha

# start the active master on spark1 and a standby master on spark2
./sbin/start-master.sh -h spark1 --properties-file conf/ha.conf
./sbin/start-master.sh -h spark2 --properties-file conf/ha.conf
# start a worker that registers with both masters (6 cores, 12G memory)
./sbin/start-slave.sh spark://spark1:7077,spark2:7077 -c 6 -m 12G -d /usr/local/spark/work

references:
https://livebook.manning.com/book/spark-in-action/chapter-11/50
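With both masters running, a quick way to exercise the HA pair is to submit a job against the comma-separated master URL: the client registers with whichever master is the current leader and fails over via ZooKeeper if that master dies. A sketch using the SparkPi example bundled with the 2.4.7 distribution (the examples jar path is standard for the hadoop2.7 build but may differ in yours; run from any node with Spark installed):

```
cd /usr/local/spark
# listing both masters lets the client fail over if spark1's master is down
./bin/spark-submit \
  --master spark://spark1:7077,spark2:7077 \
  --class org.apache.spark.examples.SparkPi \
  examples/jars/spark-examples_2.11-2.4.7.jar 100
```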