Apache Zeppelin: A Tool for Interactive Data Analysis and Visualization

2022-06-01 08:27:48

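Introduction

Apache Zeppelin is a tool for interactive data analysis and visualization. The walkthrough below installs version 0.5.6; the latest release at the time of writing was 0.9.0.

Installation

Binary installation

1) Download the binary package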

wget http://mirrors.tuna.tsinghua.edu.cn/apache/incubator/zeppelin/0.5.6-incubating/zeppelin-0.5.6-incubating-bin-all.tgz

2) Extract the archive and start Zeppelin

tar -xzvf zeppelin-0.5.6-incubating-bin-all.tgz

cd zeppelin-0.5.6-incubating-bin-all

bin/zeppelin-daemon.sh start

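Note: the default port is 8080. If that port is already in use, go to the conf directory and create zeppelin-site.xml from the template: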

cp zeppelin-site.xml.template zeppelin-site.xml

vim zeppelin-site.xml

Set the following properties:

<property>
  <name>zeppelin.server.addr</name>
  <value>172.16.1.29</value>
  <description>Server address</description>
</property>

<property>
  <name>zeppelin.server.port</name>
  <value>8080</value>
  <description>Server port.</description>
</property>

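Change zeppelin.server.port to a free port. zeppelin.server.addr defaults to 0.0.0.0 and can be left as it is, or set to the machine's own IP. On a cloud server, run ip addr to find the address actually assigned to the local interface and use that, not the public IP bound to the instance. Java 1.7 is used here.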
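The port change can also be scripted instead of edited by hand. The following is only a sketch: set_zeppelin_port is a hypothetical helper name, and it assumes GNU sed and a zeppelin-site.xml in which each element sits on its own line, with the value following the name inside the same property block.

```shell
#!/bin/sh
# set_zeppelin_port FILE PORT
# Rewrites the <value> that belongs to zeppelin.server.port and leaves
# every other property untouched.
# Hypothetical helper; assumes GNU sed (-i without a suffix argument).
set_zeppelin_port() {
  sed -i "/<name>zeppelin.server.port<\/name>/,/<\/property>/ s|<value>[^<]*</value>|<value>$2</value>|" "$1"
}

# Example call (not executed here):
#   set_zeppelin_port conf/zeppelin-site.xml 9090
```

The address range limits the substitution to the lines between the port's name element and the closing property tag, so the value of zeppelin.server.addr is not modified.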

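Accessing Zeppelin

Open localhost:8080 in a browser to reach the Zeppelin home page.

Note: the web UI listens on port 8080 by default, and startup fails if that port is already in use. In that case, go to the conf directory: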

# cp zeppelin-site.xml.template zeppelin-site.xml

and edit these parameters:

<property>
  <name>zeppelin.server.addr</name>
  <value>0.0.0.0</value>
  <description>Server address</description>
</property>

<property>
  <name>zeppelin.server.port</name>
  <value>9090</value>
  <description>Server port.</description>
</property>

Change zeppelin.server.port (9090 in this example).

zeppelin.server.addr can stay at the default 0.0.0.0 or be set to the local IP.
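Whether a port is already taken can be checked before starting the daemon. The sketch below is one possible way to do it, assuming the ss utility from iproute2 is installed; port_in_use is a hypothetical helper that only parses the listing piped into it.

```shell
#!/bin/sh
# port_in_use PORT
# Reads the output of `ss -ltn` on stdin and succeeds (exit 0) if some
# socket is listening on PORT. The fourth column is "Local Address:Port";
# the port is whatever follows the last colon.
port_in_use() {
  awk -v p="$1" '
    NR > 1 { n = split($4, a, ":"); if (a[n] == p) found = 1 }
    END    { exit !found }
  '
}

# Example call (not executed here):
#   ss -ltn | port_in_use 8080 && echo "port 8080 is busy"
```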

Building from source

1) Download the source

Zeppelin 0.5.6-incubating:

wget http://mirror.bit.edu.cn/apache/incubator/zeppelin/0.5.6-incubating/zeppelin-0.5.6-incubating.tgz

Zeppelin 0.6.0-SNAPSHOT:

git clone https://github.com/apache/zeppelin.git

2) Set up the environment

Requirements

  • Git
  • Java 1.7
  • Tested on Mac OSX, Ubuntu 14.X, CentOS 6.X, Windows 7 Pro SP1
  • Maven (if you want to build from the source code)
  • Node.js Package Manager (npm, downloaded by Maven during build phase)

Setting up the build environment

install git

# git version
git version 1.7.1

install jdk

# wget http://download.oracle.com/otn-pub/java/jdk/7u79-b15/jdk-7u79-linux-x64.tar.gz

# tar -zxf jdk-7u79-linux-x64.tar.gz -C /opt/

# cd /opt/

# ln -s jdk1.7.0_79 jdk

# vim ~/.bash_profile   (append the lines below)
 export JAVA_HOME=/opt/jdk

 export PATH=.:$JAVA_HOME/bin:$PATH

 export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

# source ~/.bash_profile
# java -version
java version "1.7.0_79"
Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
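
Since Java 1.7 is listed as a requirement, the version check can be automated before starting a long build. This is a minimal sketch with hypothetical helper names (parse_java_version, is_java7_or_newer); it assumes the version banner has the form shown above, with the version in double quotes on the first line.

```shell
#!/bin/sh
# parse_java_version
# Reads `java -version` output on stdin and prints the quoted version
# string from the banner, for example 1.7.0_79.
parse_java_version() {
  sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# is_java7_or_newer VERSION
# Succeeds for 1.7 and later in the old 1.x scheme, and for any major
# version above 1 in the newer scheme (for example 11.0.2).
is_java7_or_newer() {
  jv_major=${1%%.*}
  jv_rest=${1#*.}
  jv_minor=${jv_rest%%.*}
  if [ "$jv_major" -eq 1 ]; then
    [ "$jv_minor" -ge 7 ]
  else
    [ "$jv_major" -gt 1 ]
  fi
}

# Example call (not executed here). java -version writes to stderr,
# hence the 2>&1:
#   is_java7_or_newer "$(java -version 2>&1 | parse_java_version)" || echo "Java too old"
```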

install maven

# wget http://www.eu.apache.org/dist/maven/maven-3/3.3.3/binaries/apache-maven-3.3.3-bin.tar.gz

# tar -zxf apache-maven-3.3.3-bin.tar.gz

# ln -s apache-maven-3.3.3 maven

# echo "export MAVEN_HOME=/opt/maven" >> ~/.bash_profile

# echo 'export PATH=$MAVEN_HOME/bin:$PATH:$HOME/bin' >> ~/.bash_profile

# source  ~/.bash_profile
# mvn -version
Apache Maven 3.3.3 (7994120775791599e205a5524ec3e0dfe41d4a06; 2015-04-22T19:57:37+08:00)
Maven home: /opt/maven
Java version: 1.7.0_79, vendor: Oracle Corporation
Java home: /opt/jdk1.7.0_79/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "2.6.32-504.el6.x86_64", arch: "amd64", family: "unix"
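
One detail to watch when appending export lines with echo: inside double quotes the current shell expands variables before the text is written, so a variable that is not yet set in the running session ends up as an empty string in the profile. Single quotes write the text literally, and expansion happens later when the profile is sourced. The sketch below illustrates the difference on a temporary file rather than ~/.bash_profile; DEMO_HOME is a hypothetical variable that is deliberately left unset.

```shell
#!/bin/sh
# Compare what double and single quotes write into a profile-like file.
unset DEMO_HOME
profile=$(mktemp)

# Double quotes: $DEMO_HOME is expanded now and is empty.
echo "export PATH=$DEMO_HOME/bin:\$PATH" >> "$profile"

# Single quotes: the text is stored as written and expanded when sourced.
echo 'export PATH=$DEMO_HOME/bin:$PATH' >> "$profile"

cat "$profile"
# prints:
#   export PATH=/bin:$PATH
#   export PATH=$DEMO_HOME/bin:$PATH
```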

install node.js

yum install http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm

yum repolist

# yum search nodejs npm|wc -l
21

# sudo yum install nodejs npm --enablerepo=epel

# node -v
v0.10.42

# npm -v
1.3.6

# cd /data/

build zeppelin

# cd /data/

# wget https://github.com/apache/zeppelin/archive/v0.5.6.zip
# unzip v0.5.6.zip
# cd zeppelin-0.5.6/
# nohup mvn clean package -Pspark-1.6 -Phadoop-2.6 -Pyarn -Ppyspark -DskipTests > nohup.out &
# jobs
[1]   Running                 nohup mvn clean package -Pspark-1.6 -Phadoop-2.6 -Pyarn -Ppyspark -DskipTests > nohup.out &

# tail -f nohup.out
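
Because the build runs in the background, its outcome has to be read from nohup.out. A minimal sketch, assuming the log file name used above: build_status is a hypothetical helper that relies only on the BUILD SUCCESS and BUILD FAILURE summary lines Maven prints at the end of a run.

```shell
#!/bin/sh
# build_status LOGFILE
# Prints SUCCESS or FAILURE according to the Maven summary line, or
# UNKNOWN if neither line is present yet (still running or interrupted).
build_status() {
  if grep -q 'BUILD SUCCESS' "$1"; then
    echo SUCCESS
  elif grep -q 'BUILD FAILURE' "$1"; then
    echo FAILURE
  else
    echo UNKNOWN
  fi
}

# Example call (not executed here):
#   build_status nohup.out
```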

Reference:

https://github.com/apache/zeppelin/

FAQ

1.Exception in thread "main" Exception: java.lang.OutOfMemoryError thrown

Solution:

export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"

2.Failed to run task: 'npm install --color=false' failed. (error code 126)

[INFO] Zeppelin: Elasticsearch interpreter ................ SUCCESS [15:56 min]
[INFO] Zeppelin: web Application .......................... FAILURE [03:51 min]
[INFO] Zeppelin: Server ................................... SKIPPED
[INFO] Zeppelin: Packaging distribution ................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 20:57 min
[INFO] Finished at: 2016-06-08T02:19:40-04:00
[INFO] Final Memory: 93M/957M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal com.github.eirslett:frontend-maven-plugin:0.0.23:npm (npm install) on project zeppelin-web: Failed to run task: 'npm install --color=false' failed. (error code 126) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :zeppelin-web

Solution:

Following advice found online, edit pom.xml under zeppelin-web:

<execution>
  <id>npm install</id>
  <goals>
    <goal>npm</goal>
  </goals>
</execution>

<execution>
  <id>bower install</id>
  <goals>
    <goal>bower</goal>
  </goals>
  <configuration>
    <arguments>--allow-root install</arguments>
  </configuration>
</execution>

<execution>
  <id>grunt build</id>
  <goals>
    <goal>grunt</goal>
  </goals>
  <configuration>
    <arguments>--no-color --force</arguments>
  </configuration>
</execution>

Then run the front-end steps by hand and rebuild:

#  npm install
#  bower --allow-root install
#  grunt --force
#  mvn install -DskipTests
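
The failure log above also suggests resuming from the failed module with -rf instead of rebuilding everything. As a sketch, the module name can be pulled out of the saved log; resume_module is a hypothetical helper, and it assumes the log contains the "mvn <goals> -rf :module" hint in the form shown in the error output.

```shell
#!/bin/sh
# resume_module LOGFILE
# Prints the argument Maven suggests for -rf (for example :zeppelin-web),
# taken from the "[ERROR]   mvn <goals> -rf :module" hint in the log.
resume_module() {
  sed -n 's/^\[ERROR\] *mvn <goals> -rf \(:[^ ]*\).*/\1/p' "$1" | head -n 1
}

# Example call (not executed here), reusing the profiles from the
# original build command:
#   mvn package -Pspark-1.6 -Phadoop-2.6 -Pyarn -Ppyspark -DskipTests \
#       -rf "$(resume_module nohup.out)"
```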

Starting Zeppelin

# cd zeppelin-0.5.6-incubating
# bin/zeppelin-daemon.sh start
Log dir doesn't exist, create /opt/bigcrh/zeppelin/src/zeppelin-0.5.6-incubating/logs
Pid dir doesn't exist, create /opt/bigcrh/zeppelin/src/zeppelin-0.5.6-incubating/run
Zeppelin start                                             [  OK  ]

# jps
18710 ZeppelinServer
