Troubleshooting: Spark/Scala "Cannot resolve overloaded method 'udf'" when defining a Scala closure of 11 arguments as a user-defined function (UDF)
Computing
2021-09-14 19:00:31.0
# Problem

When running computations with Spark, we often need `udf` to register custom functions. As long as the custom function takes 10 or fewer parameters, the `udf` call compiles and runs normally. For example:

```
val makeParams: (String, String, String, String, String, String, String, String, String, String) => TestProperty =
  (orderId: String, barcode: String, deliveryId: String, mailNo: String, expressCode: String,
   platformCode: String, areaName: String, productId: String, skuId: String, shopName: String) => {
    val testEvent: TestProperty = new TestProperty(orderId, barcode, deliveryId, mailNo,
      expressCode, platformCode, areaName, productId, skuId, shopName)
    testEvent
  }

val make_params = udf(makeParams)

val result = waitWriteDataFrame
  .withColumn("properties", make_params('orderId, 'barcode, 'deliveryId, 'mailNo, 'expressCode,
    'platformCode, 'areaName, 'productId, 'skuCode, 'shopName))
  .withColumn("type", lit("track"))
  .withColumn("event_name", lit(eventName))
  .select(
    'orderId.alias("#account_id"),
    'type.alias("#type"),
    'event_name.alias("#event_name"),
    'deliveryDate.alias("#time"),
    'properties
  ).show()
```

However, once the function takes more than 10 parameters, the compiler can no longer resolve the `udf` overload, and compilation fails with:

```
Cannot resolve overloaded method 'udf'
```

# Cause

In current versions, the `udf` method in `org.apache.spark.sql` only provides overloads for functions of up to 10 parameters.

# Solution

Wrap the extra parameters in a `Map` so that the closure passed to `udf` never exceeds 10 parameters:

```
val makeParams: (String, String, String, String, String, String, String, String, String, Map[String, String]) => TestProperty =
  (orderId: String, barcode: String, deliveryId: String, mailNo: String, expressCode: String,
   platformCode: String, areaName: String, productId: String, skuId: String,
   addProperty: Map[String, String]) => {
    val testEvent: TestProperty = new TestProperty(orderId, barcode, deliveryId, mailNo,
      expressCode, platformCode, areaName, productId, skuId,
      addProperty("shopName"), addProperty("sellerMemo"))
    testEvent
  }

val make_params = udf(makeParams)

// Pack the overflow parameters (shopName, sellerMemo) into a single Map column.
val addProperty_other: (String, String) => Map[String, String] =
  (shopName: String, sellerMemo: String) =>
    Map("shopName" -> shopName, "sellerMemo" -> sellerMemo)

val addProperty_udf = udf(addProperty_other)

val result = waitWriteDataFrame
  .withColumn("addProperty", addProperty_udf('shopName, 'sellerMemo))
  .withColumn("properties", make_params('orderId, 'barcode, 'deliveryId, 'mailNo, 'expressCode,
    'platformCode, 'areaName, 'productId, 'skuCode, 'addProperty))
  .withColumn("type", lit("track"))
  .withColumn("event_name", lit(eventName))
  .select(
    'orderId.alias("#account_id"),
    'type.alias("#type"),
    'event_name.alias("#event_name"),
    'deliveryDate.alias("#time"),
    'properties
  ).show()
```
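Another option worth knowing about: while `org.apache.spark.sql.functions.udf` is capped at 10 parameters, `spark.udf.register` accepts Scala functions of up to 22 parameters, so the original 11-argument closure can be registered by name and invoked with `callUDF`. The sketch below reuses this post's `TestProperty` and `waitWriteDataFrame` names and is not verified against a specific Spark version; check the `UDFRegistration` overloads of your version before relying on it.

```scala
import org.apache.spark.sql.functions.callUDF

// Register the 11-argument closure by name; UDFRegistration.register
// has overloads beyond the 10-parameter limit of functions.udf.
spark.udf.register("make_params11",
  (orderId: String, barcode: String, deliveryId: String, mailNo: String,
   expressCode: String, platformCode: String, areaName: String,
   productId: String, skuId: String, shopName: String, sellerMemo: String) =>
    new TestProperty(orderId, barcode, deliveryId, mailNo, expressCode,
      platformCode, areaName, productId, skuId, shopName, sellerMemo))

// Invoke the registered UDF by name with all 11 columns.
val withProps = waitWriteDataFrame
  .withColumn("properties",
    callUDF("make_params11", 'orderId, 'barcode, 'deliveryId, 'mailNo,
      'expressCode, 'platformCode, 'areaName, 'productId, 'skuCode,
      'shopName, 'sellerMemo))
```

The trade-off is that a registered UDF is referenced by a string name rather than a typed `val`, so a typo in the name only surfaces at runtime, whereas the `Map`-packing approach above keeps everything checked at compile time.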