Hadoop error summary 01: https://blog.csdn.net/qq_19968255/article/details/82803768
1. When the script runs, it fails with the following error:
Examining task ID: task_201201061122_0007_m_000002 (and more) from job job_201201061122_0007
Exception in thread "Thread-23" java.lang.RuntimeException: Error while reading from task log url
at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:130)
at org.apache.hadoop.hive.ql.exec.JobDebugger.showJobFailDebugInfo(JobDebugger.java:211)
at org.apache.hadoop.hive.ql.exec.JobDebugger.run(JobDebugger.java:81)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: Server returned HTTP response code: 400 for URL: http://10.200.187.27:50060/tasklog?taskid=attempt_201201061122_0007_m_000000_2&start=-8193
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1436)
at java.net.URL.openStream(URL.java:1010)
at org.apache.hadoop.hive.ql.exec.errors.TaskLogProcessor.getErrors(TaskLogProcessor.java:120)
... 3 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
MapReduce Jobs Launched:
Copy http://xxx:50060/tasklog?taskid=attempt_201201061122_0007_m_000000_2&start=-8193 and paste it into the IE browser's address bar; the task log that comes back shows the following:
The Hadoop run had hit a java heap error. Literally this means an error occurred while allocating from the heap, and since all of an application's dynamic memory is allocated on the heap, the error indicates that memory has run out. So how large should the namenode's memory be?
The namenode keeps the metadata for every file in the cluster, so it is unrealistic to give a simple formula that computes the exact memory requirement from the file information.
Hadoop's default namenode heap is 1000 MB, which is enough for several million files; as a conservative rule of thumb, allow 1000 MB of memory per million blocks.
For example, consider a cluster of 200 nodes, each with a 24 TB disk, a block size of 128 MB, and three replicas per block, which works out to more than 12 million blocks. Roughly how much memory does the namenode need?
First, compute how many blocks the cluster can hold:
(200 * 24,000,000 MB) / (128 MB * 3) = 12,500,000 blocks
Then make a conservative estimate of the memory required:
12,500,000 blocks * 1000 MB / 1,000,000 = 12,500 MB
Based on this calculation, setting the namenode memory to something on the order of 12,500 MB is sufficient.
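The same estimate can be reproduced as a quick shell calculation; the figures below are just the example numbers from above, not a general formula:
# blocks = total raw storage / (block size * replication factor)
echo $(( (200 * 24000000) / (128 * 3) ))     # prints 12500000
# memory = 1000 MB per million blocks (conservative rule of thumb)
echo $(( 12500000 * 1000 / 1000000 ))MB      # prints 12500MB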
Once you have the rough figure, how do you actually set it?
The Hadoop configuration file hadoop-env.sh has an option, HADOOP_NAMENODE_OPTS, whose JVM flags set the namenode's heap size. For example:
HADOOP_NAMENODE_OPTS=-Xmx2000m
This allocates 2000 MB of heap to the namenode.
If you change the namenode's memory, change the secondarynamenode's memory to match; its option is HADOOP_SECONDARYNAMENODE_OPTS.
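For the example above, a minimal hadoop-env.sh sketch might look like the following; the 12500m figure comes from the estimate worked out earlier and should be adjusted to your own cluster:
# hadoop-env.sh: give the namenode (and matching secondarynamenode) a 12500 MB heap
export HADOOP_NAMENODE_OPTS="-Xmx12500m"
export HADOOP_SECONDARYNAMENODE_OPTS="-Xmx12500m"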
Sqoop: The driver has not received any packets from the server
Running list-tables and list-databases both works, but import fails. My guess is that the map tasks get dispatched to the other two Hadoop nodes, which then also need to connect to MySQL, so it is most likely still a MySQL permission problem.
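If it really is a permission problem, a MySQL 5.x-style grant along the following lines would let the worker nodes connect as well; the database, user, and password here are placeholders, not values from the job above:
# run on the MySQL server: allow the Sqoop account to connect from any host
mysql -u root -p -e "GRANT ALL PRIVILEGES ON sqoop_db.* TO 'sqoop_user'@'%' IDENTIFIED BY 'sqoop_pass'; FLUSH PRIVILEGES;"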
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2. jdbc.url=jdbc:mysql://localhost:3306/totosea?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&failOverReadOnly=false
autoReconnect
Should the driver automatically re-establish the connection when it is dropped unexpectedly?
failOverReadOnly
After a successful automatic reconnect, should the connection be put into read-only mode?
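For reference, these settings usually sit together in a JDBC properties file; a minimal sketch, in which the driver class, user name, and password are placeholders rather than values from the source:
jdbc.driverClassName=com.mysql.jdbc.Driver
jdbc.url=jdbc:mysql://localhost:3306/totosea?useUnicode=true&characterEncoding=UTF-8&autoReconnect=true&failOverReadOnly=false
jdbc.username=dbuser
jdbc.password=dbpass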
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
3. Support for Hive reserved keywords
Failed to recognize predicate 'date'. Failed rule: 'identifier' in column specification
One option is simply to avoid using the reserved keyword as an identifier; the other is to add the following to conf/hive-site.xml to disable reserved-keyword checking:
<property>
<name>hive.support.sql11.reserved.keywords</name>
<value>false</value>
</property>
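If you would rather leave hive.support.sql11.reserved.keywords at its default, Hive also accepts a reserved word as an identifier when it is quoted with backticks; a sketch with a hypothetical table and columns:
# quote the reserved column name instead of disabling the check
hive -e "SELECT \`date\`, cnt FROM access_log LIMIT 10;"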
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
4. Fix for "Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep"
14/03/26 23:10:04 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/03/26 23:10:05 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/03/26 23:10:06 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
14/03/26 23:10:07 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:10020. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
This message appeared, retrying over and over, while using Sqoop to export a Hive table to MySQL. According to material found online, the problem is caused by the HDFS path not being specified precisely.
Command that produces the error:
sqoop export --connect jdbc:mysql://c6h2:3306/log --username root --password 123 --table dailylog --fields-terminated-by '
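A hedged sketch of the same export with the HDFS path spelled out in full, in line with the explanation above; the namenode address, warehouse path, and field delimiter are assumptions for illustration, not recovered from the truncated command:
# fully qualified --export-dir so the job does not depend on a relative or implicit path
sqoop export \
  --connect jdbc:mysql://c6h2:3306/log --username root --password 123 \
  --table dailylog \
  --export-dir hdfs://c6h2:9000/user/hive/warehouse/dailylog \
  --fields-terminated-by '\t'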