feat: append OTP info to output #12
@barbeau This is a working sample using Kotlin flow that converts the callbacks to suspending functions. I'm still figuring out the way to make batches of requests and zip them. In all the examples here, they create Also, the CI is gonna fail because the library is only in |
Update: @barbeau I pushed a change that is still synchronous but is much faster than it used to be. I'm transforming the Chicago list into a flow, using the built-in filter and collection to remove location values and perform the callbacks on the IO thread. Here's how I measured the efficiency:
Nice! Could you add an average value at the bottom of each column? That should make it easier to see the differences.
Also, typically you'd throw out the first value because of JIT - it will always be much larger than the rest because some of the code gets compiled on the fly.
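As a minimal sketch of the averaging suggested above, the snippet below computes a column mean after dropping the first run as JIT warm-up. The helper name `averageAfterWarmup` and the sample timings are hypothetical and are not measurements from this PR.

```kotlin
// Hypothetical helper: average a column of timings, skipping the JIT warm-up runs.
fun averageAfterWarmup(timingsMs: List<Long>, warmupRuns: Int = 1): Double {
    val steadyState = timingsMs.drop(warmupRuns)
    require(steadyState.isNotEmpty()) { "Need at least one measurement after warm-up" }
    return steadyState.average()
}

fun main() {
    // Made-up numbers for illustration only.
    val column = listOf(900L, 110L, 100L, 90L)
    println("mean including first run: ${column.average()} ms")
    println("mean excluding first run: ${averageAfterWarmup(column)} ms")
}
```

Reporting both means side by side shows how much the first run skews the column.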
@barbeau Apologies, I just realized I was feeding in two longitude values. It looks like that doesn't affect the processing times, although the print takes longer since they actually have data to write.
@barbeau I'm having trouble emitting multiple values and collecting as one since this is what we're trying to do if the order of the data doesn't matter. However, the Kotlin |
If I'm understanding correctly, I think that's expected behavior - For producer/consumer problem, have you looked at Channels? https://kotlinlang.org/docs/channels.html#building-channel-producers To step back a minute, I think we have three things we're trying to accomplish for out-of-order execution:
The flow solution at https://stackoverflow.com/questions/60551996/wait-for-result-from-multiple-callbacks-lambdas-in-kotlin/60556171#60556171 is close, but if you're emitting 10 requests and then collecting them, I believe you'll need to wait for all 10 to finish before batching the next 10 (unless you have more than one flow active at a time). To keep this simple and make some progress towards a working solution, let's just implement the above flow solution first that blocks until the X requests finish, and then we can work on a parallel producer/consumer implementation in another PR. |
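One way to picture the "block until the X requests finish" behavior is sketched below. It is not the code from this PR or from the linked answer: it swaps in `async`/`awaitAll` over list chunks instead of a `Flow`, assumes `kotlinx.coroutines` is on the classpath, and uses a hypothetical callback API (`requestPlanAsync`) as a stand-in for the real client.

```kotlin
import kotlinx.coroutines.async
import kotlinx.coroutines.awaitAll
import kotlinx.coroutines.coroutineScope
import kotlinx.coroutines.runBlocking
import kotlin.concurrent.thread
import kotlin.coroutines.resume
import kotlin.coroutines.suspendCoroutine

// Hypothetical callback-based client call; the real library's API may differ.
fun requestPlanAsync(tripId: Int, onResult: (String) -> Unit) {
    thread {
        Thread.sleep(10L * (5 - tripId % 5)) // simulate variable latency
        onResult("plan-for-$tripId")
    }
}

// Bridge the callback to a suspending function.
suspend fun requestPlan(tripId: Int): String = suspendCoroutine { cont ->
    requestPlanAsync(tripId) { summary -> cont.resume(summary) }
}

// Launch batchSize requests at once and wait for the whole batch
// to finish before starting the next batch.
suspend fun planInBatches(tripIds: List<Int>, batchSize: Int): List<String> = coroutineScope {
    tripIds.chunked(batchSize).flatMap { batch ->
        batch.map { id -> async { requestPlan(id) } }.awaitAll()
    }
}

fun main(): Unit = runBlocking {
    println(planInBatches((1..7).toList(), batchSize = 3))
}
```

Within a batch the requests may complete in any order, but `awaitAll` returns results in the order the requests were launched, and a slow request holds up the start of the next batch. That is the limitation a later producer/consumer version would remove.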
@barbeau Thanks for the explanation. To clarify, should we try to emit multiple calls and collect the response whenever it gets one, irrespective of the order? However, I'm having trouble dynamically creating multiple flows and combining them as one to collect them later on the thread. |
Ok - can you push that code (and comment out or remove other unused code for now) and I can take a look? |
@barbeau I figured out the solution to this problem finally. It was right in front of me all along. I'm converting the list to a flow and then I'm giving another flow to the
The I tested the concurrency by appending the index of the data to the additional properties map in the |
@barbeau I have updated the |
@@ -38,4 +39,11 @@ data class ChicagoTncData
@Parsed val pickupCentroidLongitude: Double = 0.0,
@Parsed val dropoffCentroidLatitude: Double = 0.0,
@Parsed val dropoffCentroidLongitude: Double = 0.0,
@Parsed var totalTravelTime: Int? = 0,
I believe we want these metrics for the top 3 trip plan results, IIRC? So for this to output cleanly to CSV I think we'd need to label these "...1", and then have "...2", and "...3" for all the fields.
Ahh - I get it. What if there are fewer than 3 ways to reach a destination? I guess they'll hold the default values then.
Quick clarification, should the top 3 be as-is or should we sort by travel time?
> Quick clarification, should the top 3 be as-is or should we sort by travel time?
I would leave them sorted as-returned by OTP, because they are ordered by OTP's preference.
We should probably also include the priority preference used in the OTP requests, maybe as a new model field? QUICK, SAFEST, etc. This obviously impacts the sorting.
Right, a common model field for all 3 trip plan results that shows what the OptimizeType was.
I'd say let's do it all in this PR. We probably won't know for sure that the model objects are correct until we actually output it. |
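The column layout discussed in this thread could look roughly like the sketch below: one shared optimize-type column, then numbered columns for the top 3 itineraries in the order OTP returned them, padded with defaults when fewer than 3 come back. Everything here is illustrative. `ItineraryMetrics`, `csvHeader`, `csvRow`, and the `transfers` field are hypothetical names, and only `totalTravelTime` appears in the diff under review.

```kotlin
// Hypothetical per-itinerary metrics; the real model has its own fields.
data class ItineraryMetrics(val totalTravelTime: Int = 0, val transfers: Int = 0)

const val MAX_ITINERARIES = 3

// Header: one shared column, then "...1", "...2", "...3" for every metric.
fun csvHeader(): List<String> =
    listOf("optimizeType") + (1..MAX_ITINERARIES).flatMap { n ->
        listOf("totalTravelTime$n", "transfers$n")
    }

// Row: keep the order OTP returned (no re-sorting) and fall back to
// default values for missing itineraries.
fun csvRow(optimizeType: String, itineraries: List<ItineraryMetrics>): List<String> {
    val topThree = List(MAX_ITINERARIES) { i -> itineraries.getOrElse(i) { ItineraryMetrics() } }
    return listOf(optimizeType) + topThree.flatMap {
        listOf(it.totalTravelTime.toString(), it.transfers.toString())
    }
}

fun main() {
    println(csvHeader().joinToString(","))
    // Only two itineraries returned, so the third set of columns holds defaults.
    val row = csvRow("QUICK", listOf(ItineraryMetrics(1200, 1), ItineraryMetrics(1500, 0)))
    println(row.joinToString(","))
}
```

A default of 0 cannot be told apart from a real zero in the output, so a blank or nullable default may be worth considering when the actual exporter is wired up.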
Remove ability to output a GTFS dataset based on TNC data input. This feature is no longer a focus of this project.
And print trip ID for each trip
Add logging
Also don't fill and return chicagoTncData - just fill, which means it can be a val instead of var
Also convert more files to Kotlin
It's needed this way for the univocity CSV exporter to work
CI is failing because CUTR-at-USF/opentripplanner-client-library#14 hasn't been resolved (that library artifact hasn't been published yet), but I'm going to merge anyway as this is a large PR and the main feature of this application. I'll open an issue separately for getting CI working again. |
PR to append the output to the Chicago ride-hailing data, discussed in #11. (WIP)
TODO: