Error mapping a nested format #46

Open
AndresUrregoAngel opened this issue Jul 16, 2018 · 2 comments

Comments

@AndresUrregoAngel

Hello guys,

After loading my DynamoDB table using Scala, I get an error when I try to retrieve a row filtered on a specific value.

The full record format is:

{
  "agent_id": 14732,
  "called": "+185999999",
  "to_city": "null",
  "language": "english",
  "team_id": 15,
  "taskChannelUniqueName": "voice",
  "caller_country": "CA",
  "price": "-0.22880",
  "account_sid": "AC3bdb78f4397e3951251",
  "from": "+178049978523",
  "ivr_call_sid": "C983a7254cb0ade7862705",
  "travel_type": "international",
  "priceUnit": "USD",
  "called_zip": "null",
  "from_city": "EDMONTON",
  "caller_zip": "null",
  "from_state": "AB",
  "start_time": "2018-07-04 18:57:31",
  "site_id": "4",
  "status": "completed",
  "from_country": "CA",
  "original_task_sid": "XFT05514703a5c53ec0338a7ceb9cddf16c",
  "conference_sid": "CF6ffadad8e05be8b48d78c0704b21562c",
  "direct": "inbound",
  "to_country": "US",
  "conversations": {
    "handling_team_name": "Team 1",
    "conversation_attribute_1": "No",
    "hold_time": 12,
    "ivr_path": "other_inquiries",
    "ivr_time": 57,
    "conversation_id": "XDT55414703a5c53ec0338a7ceb9cddf16c",
    "segment": "CAe8e393775870c1ec4d765c2355009acd",
    "segment_link": [
      "https:\\/\\/d3dx7qtk6i17ic.cloudfront.net\\/production\\/2018\\/07\\/04\\/15\\/RE6eede8cf6d0a574a4f4564a32e555b38.wav"
    ],
    "external_contact": "JFLY TP",
    "case": 668132,
    "outcome": null,
    "queue": "Non-Revenue",
    "handling_team_id": 15,
    "in_business_hours": "Yes",
    "direction": "Inbound"
  },
  "to_state": "null",
  "products": {
    "product_attribute_1": "International",
    "brand": "JustFly",
    "product_attribute_2": "null"
  },
  "duration": "946",
  "booking_id": "108154082",
  "SourceApp": "ServPro",
  "call_sid": "CAb46c43e3e8a983a7254cb0ade7862705",
  "conversation_id": "XDT55414703a5c53ec0338a7ceb9cddf16c",
  "from_zip": "null",
  "customers": {
    "gender": "Male",
    "market_segment": "1st Time",
    "name": "Kofi Annor",
    "customer_id": 58484842,
    "business_value": "5596.19",
    "email": "[email protected]",
    "year_of_birth": 1521,
    "acquisition_date": "2017-12-15"
  },
  "department": "service",
  "external_contact": "JFLY TP",
  "call_type": "other_inquiries",
  "direction": "inbound",
  "support_case_id": 668132,
  "caller_state": "AB",
  "to_zip": "null",
  "called_country": "US",
  "end_time": "2018-07-04 19:13:17",
  "called_city": "null",
  "api_version": "2010-04-01",
  "called_state": "null",
  "agents": {
    "full_name": "Customer Name",
    "role": "agent_level_1",
    "agent_id": 14732,
    "manager": null,
    "location": null,
    "team_id": 15,
    "department": "service",
    "email": "[email protected]",
    "team_name": "Team 1"
  },
  "caller": "+19878874894",
  "caller_city": "EDMONTON",
  "to": "+15458779216"
}

Error:

at java.lang.Thread.run(Thread.java:748)
18/07/16 18:33:26 ERROR DynamoDBRelation: Failed converting item to row: {"conversation_id":"WT00014703a5c53ec0338a7ceb9cddf16c","agents":{"full_name":"Princess Naquita","role":"agent_level_1","agent_id":14732,"manager":"None","location":"None","team_id":15,"department":"service","email":"[email protected]","team_name":"Team 1"}}
scala.MatchError: StructType(StructField(agent_id,LongType,true), StructField(department,StringType,true), StructField(email,StringType,true), StructField(full_name,StringType,true), StructField(location,StringType,true), StructField(manager,StringType,true), StructField(role,StringType,true), StructField(team_id,LongType,true), StructField(team_name,StringType,true)) (of class org.apache.spark.sql.types.StructType)
at com.github.traviscrawford.spark.dynamodb.ItemConverter$$anonfun$1.apply(ItemConverter.scala:30)
at com.github.traviscrawford.spark.dynamodb.ItemConverter$$anonfun$1.apply(ItemConverter.scala:20)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
at org.apache.spark.sql.types.StructType.foreach(StructType.scala:98)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at org.apache.spark.sql.types.StructType.map(StructType.scala:98)
at com.github.traviscrawford.spark.dynamodb.ItemConverter$.toRow(ItemConverter.scala:20)
at com.github.traviscrawford.spark.dynamodb.DynamoDBRelation$$anonfun$scan$1$$anonfun$7.apply(DynamoDBRelation.scala:131)
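
For readers unfamiliar with `scala.MatchError`: it is thrown when a value reaches a pattern match that has no case for it. The trace above suggests that the match at `ItemConverter.scala:30` had no case for a nested `StructType` field such as `agents`. That reading is an assumption based only on the trace, not on the library source. The sketch below reproduces the same kind of failure with hypothetical stand-in types (they are not Spark's real classes) and shows a recursive variant that handles the nested case.

```scala
// Hypothetical stand-ins for Spark's data types. These are NOT the real
// org.apache.spark.sql.types classes; they only mimic their shape.
sealed trait DataType
case object LongType extends DataType
case object StringType extends DataType
final case class StructField(name: String, dataType: DataType)
final case class StructType(fields: Seq[StructField]) extends DataType

object MatchErrorSketch {
  // Assumed shape of the failing code path: only scalar types are handled.
  // @unchecked silences the compiler's exhaustiveness warning on purpose.
  def describeScalarOnly(dt: DataType): String = (dt: @unchecked) match {
    case LongType   => "long"
    case StringType => "string"
    // No case for StructType, so a nested field throws scala.MatchError.
  }

  // Handling StructType recursively covers nested records.
  def describe(dt: DataType): String = dt match {
    case LongType   => "long"
    case StringType => "string"
    case StructType(fields) =>
      fields.map(f => s"${f.name}:${describe(f.dataType)}").mkString("struct<", ",", ">")
  }

  def main(args: Array[String]): Unit = {
    // A cut-down version of the "agents" attribute from the record above.
    val agents = StructType(Seq(
      StructField("agent_id", LongType),
      StructField("team_name", StringType)
    ))
    // The scalar-only match fails on the nested type.
    println(scala.util.Try(describeScalarOnly(agents)).isFailure) // prints "true"
    // The recursive match succeeds.
    println(describe(agents)) // prints "struct<agent_id:long,team_name:string>"
  }
}
```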

@lockwobr

I was getting the same issue and figured out how to get around it. It's not ideal, but it seems to do the trick. I need to do more testing, but this might result in two reads, in which case you might need to add a persist or cache to the RDD so that it is not read again when inferSchema runs.

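  import com.github.traviscrawford.spark.dynamodb._
  import spark.implicits._ // provides .toDS; already in scope in spark-shell

  val tablename = "foobar"
  val region = "us-east-1"
  val totalSegments = 8 // number of parallel scan segments
  val pageSize = 1000   // items per scan page
  // Scan the table as JSON strings and let Spark's JSON reader infer the nested schema.
  val items = DynamoScanner(spark.sparkContext, tablename, totalSegments, pageSize, None, None, Some(region))
  val table = spark.read.option("inferSchema", "true").json(items.toDS)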
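
The persist idea mentioned above could look roughly like the following. This is a minimal sketch, not tested against a real table: `readItems` and `PersistBeforeInfer` are hypothetical names, and the small in-memory RDD in `main` only stands in for the `RDD[String]` of JSON items that `DynamoScanner` is assumed to return. The point is that the JSON reader makes one pass to infer the schema and the query makes another, so persisting the RDD first should keep the second pass from scanning DynamoDB again.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.storage.StorageLevel

object PersistBeforeInfer {
  // Hypothetical helper: persist the JSON items, then let Spark infer the schema.
  // The RDD is left persisted so later queries on the DataFrame reuse it;
  // call items.unpersist() once the DataFrame is no longer needed.
  def readItems(spark: SparkSession, items: RDD[String]): DataFrame = {
    import spark.implicits._ // provides .toDS on RDD[String]
    items.persist(StorageLevel.MEMORY_AND_DISK)
    spark.read.json(items.toDS) // schema inference happens here (first pass)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("persist-before-infer")
      .master("local[*]")
      .getOrCreate()

    // Stand-in for the scanner output: one item with a nested "agents" map.
    val items = spark.sparkContext.parallelize(Seq(
      """{"conversation_id":"a","agents":{"agent_id":14732,"team_name":"Team 1"}}"""
    ))

    val table = readItems(spark, items)
    table.printSchema()                      // "agents" is inferred as a struct
    table.select("agents.team_name").show()  // second pass reads the cached RDD

    items.unpersist()
    spark.stop()
  }
}
```

With the scanner from the snippet above, the call would be `readItems(spark, DynamoScanner(...))`, assuming the scanner result is an `RDD[String]`.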

@AndresUrregoAngel
Author

I think this was caused by an incompatibility in the EMR version I was running at the time.
