py4j.protocol.Py4JJavaError: An error occurred while calling o90.save

Published: 2023/12/31

Environment:

Ubuntu19.10

anaconda3-python3.6.10

scala 2.11.8

apache-hive-3.0.0-bin

hadoop-2.7.7

spark-2.3.1-bin-hadoop2.7

java version "1.8.0_131"

MySQL Server version: 8.0.19-0ubuntu0.19.10.3 (Ubuntu)

Driver: mysql-connector-java-8.0.20.jar

Driver download: https://mvnrepository.com/artifact/mysql/mysql-connector-java/8.0.20

The code used:

```python
import pandas as pd
from pyspark.sql import SparkSession
from pyspark import SparkContext
from pyspark.sql import SQLContext

def map_extract(element):
    file_path, content = element
    year = file_path[-8:-4]
    return [(year, i) for i in content.split("\n") if i]

spark = SparkSession \
    .builder \
    .appName("PythonTest") \
    .getOrCreate()

res = spark.sparkContext.wholeTextFiles('hdfs://Desktop:9000/user/mercury/names',
                                        minPartitions=40) \
    .map(map_extract) \
    .flatMap(lambda x: x) \
    .map(lambda x: (x[0], int(x[1].split(',')[2]))) \
    .reduceByKey(lambda x, y: x + y)

df = res.toDF(["key", "num"])  # rename the columns to match the target MySQL table
# print(dir(df))
df.printSchema()
print(df.show())
df.printSchema()

df.write.format("jdbc").options(
    url="jdbc:mysql://127.0.0.1:3306/leaf",
    driver="com.mysql.cj.jdbc.Driver",
    dbtable="spark",
    user="appleyuchi",
    password="appleyuchi"
).mode('append').save()
```
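As an aside, the `map_extract` helper above assumes file names like `yob1880.txt`, where the four characters before the extension encode the year, and CSV lines whose third field is a count. A minimal, Spark-free sketch of that parsing logic (the sample file name and contents here are hypothetical, made up for illustration):

```python
def map_extract(element):
    """Split one (file_path, content) pair into (year, line) records."""
    file_path, content = element
    year = file_path[-8:-4]  # the four characters before ".txt"
    return [(year, i) for i in content.split("\n") if i]

# Hypothetical sample mimicking one file from the names dataset:
sample = ("hdfs://Desktop:9000/user/mercury/names/yob1880.txt",
          "Mary,F,7065\nAnna,F,2604\n")

records = map_extract(sample)
print(records)          # [('1880', 'Mary,F,7065'), ('1880', 'Anna,F,2604')]

# The downstream pipeline then sums the third CSV field per year:
total = sum(int(line.split(',')[2]) for _, line in records)
print(total)            # 9669
```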


The job was submitted as follows (both of the first two methods reproduce the bug):

① pyspark --master yarn (then enter the code above in the interactive shell)

② spark-submit --master yarn --deploy-mode cluster source.py

③ pyspark --master yarn --conf spark.executor.extraClassPath=/home/appleyuchi/bigdata/apache-hive-3.0.0-bin/lib/mysql-connector-java-8.0.20.jar

also reports a similar error.
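Note that method ③ puts the connector jar only on the executor classpath. A common cause of a `Py4JJavaError` on `.save()` with a JDBC sink is that the driver JVM cannot load `com.mysql.cj.jdbc.Driver` at all. A hedged sketch of shipping the jar to both the driver and the executors via `--jars` (the jar path is the one from this post; `source.py` stands in for your script):

```shell
# Assumption: --jars distributes the listed jar to the driver and every
# executor, which is the usual way to make a JDBC driver class loadable.
JAR=/home/appleyuchi/bigdata/apache-hive-3.0.0-bin/lib/mysql-connector-java-8.0.20.jar

spark-submit --master yarn --deploy-mode cluster \
  --jars "$JAR" \
  source.py

# Or, for the interactive shell:
pyspark --master yarn --jars "$JAR"
```

This is only one plausible fix; whether it resolves this particular `o90.save` error depends on the actual Java stack trace beneath the Py4J wrapper.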


Solution:

https://gitee.com/appleyuchi/cluster_configuration/blob/master/物理环境配置流程-必须先看.txt


