emitting objects as they are parsed #4

Open
dominictarr opened this issue Sep 23, 2011 · 5 comments

Comments

@dominictarr

as discussed on the mailing list, http://groups.google.com/group/nodejs/browse_thread/thread/1c0eac0ba0f04737/af9f180ac34ab264?lnk=gst&q=event-stream#af9f180ac34ab264

it would be great to get a streaming parser that could emit js objects as soon as they are available. this would greatly aid rapid responses to JSON apis.

most JSON apis return a response with an array of objects.
I propose emitting the elements of this array on the 'data' event.

I've forked this project and i'll see if I can get something working.

also, i've opened a similar issue here: creationix/jsonparse#1

@dominictarr
Author

first thing I gotta say: your code is very nicely written.

okay, I've got a basic implementation here:
https://github.com/dominictarr/node-json-streams

the hardest part was that set(a) was being called when the object starts to be parsed, which is not the right time to emit it. so, I had to move that bit for arrays and objects.

what i've done here is just a quick hack, so I wouldn't want you to pull this, but I think this is heading in the right direction. what is needed is a more elegant way to inform the parent object that a list type is completely parsed.

basically, I just push a function onto the next array that calls set just before Parser.expect is called. it returns true, so that the input is not consumed.

what do you think?

@dominictarr
Author

this does raise a sticky issue about which object to emit data on.

currently, i'm just emitting the elements of the first array that turns up in the stream.

this is good for being simple, but bad when you get an amorphous object that happens to have an array somewhere in it.

I think a better approach, moving forward, is to specify the path to the array whose members are to be streamed.

 new ParseStream(['rows'])//emit every item in the 'rows' property of the root object
 new ParseStream([])      //emit every item in the root object
 new ParseStream([])      //just emit the root object (will only produce one 'data' emit)
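The selection rule behind a path argument like this could be sketched as follows. This is only an illustration, not the code in the fork: `isDirectChild` and `selectItems` are hypothetical names, and the example walks an already parsed object purely to show which values would be picked. A streaming parser would presumably run the same check each time a value is completed.

```javascript
// Hypothetical helper: true when `currentPath` names a direct child of `targetPath`.
function isDirectChild(targetPath, currentPath) {
  if (currentPath.length !== targetPath.length + 1) return false;
  return targetPath.every((key, i) => key === currentPath[i]);
}

// Illustration only: collect the values that a stream configured with
// `targetPath` would emit, by walking a value that is already in memory.
function selectItems(root, targetPath) {
  const emitted = [];
  (function walk(value, path) {
    if (isDirectChild(targetPath, path)) {
      emitted.push(value);
      return; // items are emitted whole, so do not descend into them
    }
    if (value !== null && typeof value === 'object') {
      for (const key of Object.keys(value)) {
        // array indexes become numbers, object keys stay strings
        walk(value[key], path.concat(Array.isArray(value) ? Number(key) : key));
      }
    }
  })(root, []);
  return emitted;
}

const response = { total_rows: 2, rows: [{ id: 'a' }, { id: 'b' }] };
console.log(selectItems(response, ['rows'])); // the two row objects
console.log(selectItems(response, []));       // every item in the root object
```

With `['rows']` only the members of `rows` are selected, so an unrelated array elsewhere in the response is ignored.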

@Floby
Owner

Floby commented Sep 23, 2011

Thanks! I'm glad to see that someone is here to give me a motivational boost on this project =)

I'm not very sure about the name of the event. data is meant to emit sequential and unified data. In fact, I'm not sure the parser should be a readable stream at all. I thought of it as the end of the queue in a node.js pipeline. I think the only way to make it a readable stream is to make it transparent and just emit the data as it receives it.

But! we definitely need to have finer events. What I thought of was:

  • keep track of the "JSONPath" of the current object or array or whatever is being parsed
  • emit events with this very JSONPath as the name of the event, regardless of whether someone is listening or not
  • allow binding listeners to sub-objects so that you don't have to know the full JSONPath to a deeply nested value
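A rough sketch of the first two points, using Node's built-in EventEmitter. Everything here is an assumption made for illustration: `PathEmitter`, `enter`, `leave` and the `$.rows[0]` naming are hypothetical and not part of this project. Binding listeners to sub-objects (the third point) is not covered.

```javascript
const { EventEmitter } = require('events');

// Hypothetical formatting of a path array as a JSONPath-like event name,
// e.g. ['rows', 0] becomes '$.rows[0]'.
function pathToEventName(path) {
  return path.reduce(
    (name, key) => name + (typeof key === 'number' ? '[' + key + ']' : '.' + key),
    '$'
  );
}

class PathEmitter extends EventEmitter {
  constructor() {
    super();
    this.path = []; // path of the value currently being parsed
  }
  // called when the parser starts a value under `key`
  enter(key) {
    this.path.push(key);
  }
  // called when that value is complete; emits whether or not anyone listens
  leave(value) {
    this.emit(pathToEventName(this.path), value);
    this.path.pop();
  }
}

// Simulate what a parser would do for {"rows":[{"id":"a"}]}
const parser = new PathEmitter();
parser.on('$.rows[0]', (value) => console.log('first row:', value));
parser.enter('rows');
parser.enter(0);
parser.leave({ id: 'a' });
parser.leave([{ id: 'a' }]);
```

Emitting on completion keeps the example short. Notifying a listener as soon as an object or array starts would need an extra event at `enter` time.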

This may require hooking into the event system. This is why I put the set() for arrays and objects at the beginning of the parsing, so that the listener can say "okay, for this particular object|array I want to be notified of everything added to it".

does that make sense? I am not aware of all the use cases for this, so I might be completely wrong. For example, if the parser is used for filtering in a pipeline, it might be better to let the "observer" of the stream emit data events itself.

I really don't know. I need some advice on this.

@ghost ghost assigned Floby Sep 23, 2011
@dominictarr
Author

okay, that is a fair point.

a parser is a general purpose thing, what I am proposing is a specialized use-case.
with finer events, it would be possible to write a readable stream using this, and i'd be happy.
basically, what you need is the value and the current path.

so, in my couchdb example, for a row of a view request it might look like this:

  {
    value: {
      id: 'zsock',
      key: 'zsock',
      value: { rev: '16-4f975b91f0f9c2d2a2501e362401c368' }
    },
    path: ['rows', 102],
    root: rootObject
  }

of course, value could be a primitive, and for the root object, path would be []

then, that could be used as an input for any kinda XPath sort of thing, like https://github.com/s3u/JSONPath for example
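To make the idea concrete, here is a minimal sketch of a row stream built on top of such events. It assumes the parser emits a `'value'` event carrying `{ value, path, root }` and an `'end'` event. Those event names and the `rowStream` helper are hypothetical, and a plain EventEmitter stands in for both the parser and the readable stream.

```javascript
const { EventEmitter } = require('events');

// Re-emit as 'data' every value that is a direct member of the array at `arrayPath`.
// Assumes `parser` emits 'value' events shaped like { value, path, root }.
function rowStream(parser, arrayPath) {
  const out = new EventEmitter();
  parser.on('value', (event) => {
    const isItem =
      event.path.length === arrayPath.length + 1 &&
      arrayPath.every((key, i) => key === event.path[i]);
    if (isItem) out.emit('data', event.value);
  });
  parser.on('end', () => out.emit('end'));
  return out;
}

// Stand-in for a real parser, fed by hand for the example.
const fakeParser = new EventEmitter();
const rows = rowStream(fakeParser, ['rows']);
rows.on('data', (row) => console.log('row:', row.id));

const rootObject = {};
fakeParser.emit('value', {
  value: { id: 'zsock', key: 'zsock' },
  path: ['rows', 102],
  root: rootObject
});
fakeParser.emit('value', { value: 3, path: ['total_rows'], root: rootObject });
fakeParser.emit('end');
```

Only the first event passes the filter, because `['total_rows']` is not a member of `rows`. A selector library could replace the hand-written path check.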

@dominictarr
Author

okay, I just opened two issues on JSON selector libraries, so we'll see what they think.

lloyd/JSONSelect#26

JSONPath-Plus/JSONPath#3
