Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug]Use the latest version #780

Open
weiliudong opened this issue Sep 30, 2022 · 32 comments · Fixed by #799
Open

[bug]Use the latest version #780

weiliudong opened this issue Sep 30, 2022 · 32 comments · Fixed by #799
Labels

Comments

@weiliudong
Copy link

After replacing with the latest version, the device is abnormally connected to the cluster.
"aedes": "^0.48.0"
"aedes-persistence-redis": "^9.0.1"
"mqemitter-redis": "^5.0.0"

4|mglink_mqtt | TypeError: Cannot read properties of undefined (reading 'length')
4|mglink_mqtt | at validateTopic (/root/mglink_node/node_modules/aedes/lib/utils.js:6:21)
4|mglink_mqtt | at SubState.authorize (/root/mglink_node/node_modules/aedes/lib/handlers/subscribe.js:93:15)
4|mglink_mqtt | at Holder.work (/root/mglink_node/node_modules/fastfall/fall.js:105:23)
4|mglink_mqtt | at SubState.compiled (/root/mglink_node/node_modules/fastfall/fall.js:55:17)
4|mglink_mqtt | at SubscribeState.doSubscribe (/root/mglink_node/node_modules/aedes/lib/handlers/subscribe.js:89:16)
4|mglink_mqtt | at goResultsFunc (/root/mglink_node/node_modules/fastparallel/parallel.js:63:16)
4|mglink_mqtt | at Aedes.parallel [as _parallel] (/root/mglink_node/node_modules/fastparallel/parallel.js:33:9)
4|mglink_mqtt | at handleSubscribe (/root/mglink_node/node_modules/aedes/lib/handlers/subscribe.js:78:17)
4|mglink_mqtt | at ClientPacketStatus.restoreSubs (/root/mglink_node/node_modules/aedes/lib/handlers/connect.js:180:5)
4|mglink_mqtt | at makeCallTwo (/root/mglink_node/node_modules/fastseries/series.js:150:3)

help me

@weiliudong
Copy link
Author

server linux version : Linux iZwz9d6oh1bh7ilukhile6Z 4.19.0-18-amd64 #1 SMP Debian 4.19.208-1 (2021-09-29) x86_64 GNU/Linux

@robertsLando
Copy link
Member

@weiliudong What's the previous version you was using? Do you also know what's the subscrbie action that causes this?

@weiliudong
Copy link
Author

@robertsLando
The previous version was
"aedes": "^0.46.3"
"aedes-persistence-redis": "^8.0.1"
"mqemitter-redis": "^4.2.0"

@robertsLando
Copy link
Member

Could you tell me what subscribe command throws tha?

@weiliudong
Copy link
Author

@robertsLando It should be an exception when this method is called.
function authorize (sub, done) {
const err = validateTopic(sub.topic, 'SUBSCRIBE')
if (err) {
return done(err)
}
const client = this.client
client.broker.authorizeSubscribe(client, sub, done)
}

Compare the old version

This is a lot like theme permissions
authorizeSubscribe
The problem
but i'm not sure

@robertsLando
Copy link
Member

The problem is that for some reason the topic is null/undefined when it comes there, question is why and how it could be undefined. This is why I'm asking what kind of subscribe causes that problem

@weiliudong
Copy link
Author

Received, here is to determine where the caller is, find and send it to you for analysis

@weiliudong
Copy link
Author

@robertsLando
I found the problem here, it is the problem of your module configuration
"aedes": "^0.48.0",
"aedes-persistence-redis": "^9.0.1",
When used together, the above-mentioned problems occur directly;

when I replace it with
"aedes": "^0.48.0",
"aedes-persistence-redis": "^8.0.1",
fully working fine

Conclusion: The new version of "aedes": "^0.48.0" does not fit the new version of "aedes-persistence-redis": "^9.0.1"

Currently
"aedes": "^0.48.0" and "aedes-persistence-redis": "^8.0.1" will work.

This is the problem we are currently encountering with the upgrade module

Because it is a problem with aedes internal module adaptation, we can't confirm the specific problem for you, you have to investigate.

@robertsLando
Copy link
Member

Could you give a try to

"aedes": "^0.48.0",
"aedes-persistence-redis": "^9.0.0",

@robertsLando
Copy link
Member

@weiliudong does that fixed your problem?

@rosenvivanov
Copy link

rosenvivanov commented Oct 31, 2022

Could you give a try to

"aedes": "^0.48.0",
"aedes-persistence-redis": "^9.0.0",

It's the same error. I've tried with:

  • "aedes-persistence-redis": "^9.0.1" and "^9.0.0"
  • "mqemitter-redis": "^5.0.0" and "^4.2.0"

/home/aedes-broker-tests/node_modules/standard-as-callback/built/index.js:6
throw e;
^

TypeError: Cannot read property 'length' of undefined
at validateTopic (/home/aedes-broker-tests/node_modules/aedes/lib/utils.js:6:21)
at SubState.authorize (/home/aedes-broker-tests/node_modules/aedes/lib/handlers/subscribe.js:93:15)
at Holder.work (/home/aedes-broker-tests/node_modules/fastfall/fall.js:105:23)
at SubState.compiled (/home/aedes-broker-tests/node_modules/fastfall/fall.js:55:17)
at SubscribeState.doSubscribe (/home/aedes-broker-tests/node_modules/aedes/lib/handlers/subscribe.js:89:16)
at goResultsFunc (/home/aedes-broker-tests/node_modules/fastparallel/parallel.js:63:16)
at Aedes.parallel [as _parallel] (/home/aedes-broker-tests/node_modules/fastparallel/parallel.js:33:9)
at handleSubscribe (/home/aedes-broker-tests/node_modules/aedes/lib/handlers/subscribe.js:78:17)
at ClientPacketStatus.restoreSubs (/home/aedes-broker-tests/node_modules/aedes/lib/handlers/connect.js:180:5)
at makeCallTwo (/home/aedes-broker-tests/node_modules/fastseries/series.js:150:3)

@robertsLando
Copy link
Member

@rosenvivanov could you create a script to reproduce this error please?

@rosenvivanov
Copy link

rosenvivanov commented Oct 31, 2022

@rosenvivanov could you create a script to reproduce this error please?

Actually, something mess related to the node folder. With deletion and clear npm install, now it's working. I'll test more and will let you know.

@robertsLando
Copy link
Member

ok thanks!

@rosenvivanov
Copy link

rosenvivanov commented Nov 1, 2022

ok thanks!

I've tested and I haven't managed to get the above error. We can consider that it has been something broken in the node modules.

But I noticed something other:

Event connectionError: null
Error: read ECONNRESET
at TCP.onStreamRead (internal/stream_base_commons.js:209:20) {
errno: -104,
code: 'ECONNRESET',
syscall: 'read'
}

This is printed by below event every 20-30 sec

aedes.on('connectionError', function (client, error) { console.log('Event connectionError: \x1b[31m' + (client ? client.id : 'undefined') + '\x1b[0m'); console.log(error); })

@robertsLando
Copy link
Member

robertsLando commented Nov 2, 2022

That error is emitted here

ECONNRESET error is emitted when:

the other side of the TCP conversation abruptly closed its end of the connection. This is most probably due to one or more application protocol errors.

Essentially one of your clients connecting to broker closes the connection before ending the connect, for some unknown reason

@weiliudong
Copy link
Author

@robertsLando It is normal for the device side to have jitter due to poor network connection, but as aedes, we should not interrupt the main process of the entire program for this reason, but throw the exception to the application layer for processing.

The production environment has already deployed a lot of aedes, and they dare not upgrade to the new version.

For this length exception, most of them should be attributable to protocol errors, resulting in aede not being able to get the topic, unable to parse the corresponding topic length, and causing the program to crash.

@weiliudong
Copy link
Author

weiliudong commented Nov 2, 2022

This is the part I made by downloading the source code of aedes and then modifying it so that the new version of aedes
"aedes": "^0.48.0",
"aedes-persistence-redis": "^9.1.0",
to work together

function handleSubscribe(client, packet, restore, done) {
    packet.subscriptions = replace(client, packet)// Subscribe while processing connection
    packet.subscriptions = packet.subscriptions.length === 1 ? packet.subscriptions : _dedupe(packet.subscriptions)
    client.broker._parallel(
        new SubscribeState(client, packet, restore, done), // what will be this in the functions
        doSubscribe, // function to call
        packet.subscriptions, // first argument of the function
        restore ? done : completeSubscribe // the function to be called when the parallel ends
    )

    function replace(client, packet) {
        const {id} = client
        const array = []
        const {subscriptions} = packet
        for (const item of subscriptions) {
            if (item === 49) {
                array.push({topic: `${id}/connect`, qos: 0})
            } else {
                array.push(item)
            }
        }
        return array
    }
}

@robertsLando
Copy link
Member

It is normal for the device side to have jitter due to poor network connection, but as aedes, we should not interrupt the main process of the entire program for this reason, but throw the exception to the application layer for processing.

Sure, what I don't understand is why this happens, I mean I know where the exception is thrown: https://github.com/moscajs/aedes/blob/main/lib/utils.js#L6

Because fixing that code takes 10 seconds but I have a feel the error could just be hidden in that way, topic should never be null

@robertsLando
Copy link
Member

@weiliudong Please explain this:

for (const item of subscriptions) {
            if (item === 49) {
                array.push({topic: `${id}/connect`, qos: 0})
            } else {
                array.push(item)
            }
        }

Why an item could be 49 ??

@weiliudong
Copy link
Author

@robertsLando yes

@weiliudong
Copy link
Author

We use the C code written by this connected module to connect to aedes, you can use this module to reproduce
https://mongoose.ws/tutorials/
image
you can look here

@weiliudong weiliudong reopened this Nov 2, 2022
@weiliudong
Copy link
Author

weiliudong commented Nov 2, 2022

@robertsLando

@weiliudong Please explain this:

for (const item of subscriptions) {
if (item === 49) {
array.push({topic: ${id}/connect, qos: 0})
} else {
array.push(item)
}
}
Why an item could be 49 ??
image

image

It seems that there is a problem with the action "connectActions" when connecting

What is the client.broker._series method used for?

@robertsLando
Copy link
Member

Ok so that item is something fetched from redis subscriptions? I don't remember if the error goes away by downgrading aedes persistence redis or not. (be sure to clean up redis db when doingg a new test or some errors could be due to the previous versions)

@robertsLando
Copy link
Member

robertsLando commented Nov 2, 2022

What is the client.broker._series method used for?

It uses fast-series module:

Zero-overhead series function call for node.js

@robertsLando
Copy link
Member

robertsLando commented Nov 2, 2022

So the error could be in redis addSubscriptions method: https://github.com/moscajs/aedes-persistence-redis/blob/master/persistence.js#L113

Or subscriptionsByClient (used to fetch subscriptions to restore once a client connects with clean flag set to false): https://github.com/moscajs/aedes-persistence-redis/blob/master/persistence.js#L194

Could you try to debug this to see if something is wrong there?

@weiliudong
Copy link
Author

remember

ok

@reijosirila
Copy link

reijosirila commented Dec 13, 2022

Or subscriptionsByClient (used to fetch subscriptions to restore once a client connects with clean flag set to false): https://github.com/moscajs/aedes-persistence-redis/blob/master/persistence.js#L194

We have the same issue upgrading from

"aedes": "0.46.3",
"aedes-persistence-redis": "8.0.1",

to

"aedes": "0.48.0",
"aedes-persistence-redis": "9.0.1", 

In our case it happens only with clients with clean flag false. I'm unable to reproduce this on test env with clean redis. But it happens if redis database kept between updates.

Following simple test fails if very same redis is used, and cache not cleaned, aedes updated. Client is mqtt.js promisfied.

    // start with aedes 0.46.3 and aedes-persistence-redis 9.0.1
    let service = await connect(host, { clientId: 'c', username: 'u', password: 'p', clean: false, resubscribe: false });
    await service.subscribe('device/id/+/#'); 
    await service.end(true);
    // update aedes to 0.48.0 and aedes-persistence-redis to  9.0.1 , but keep the Redis database
    let service = await connect(host, { clientId: 'c', username: 'u', password: 'p', clean: false, resubscribe: false });
    await service.subscribe('device/id/+/#');  
    await service.end(true);
    // throws TypeError: Cannot read properties of undefined (reading 'length')
    // downgrade aedes/aedes-persistence-redis or clean redis.
    let service = await connect(host, { clientId: 'c', username: 'u', password: 'p', clean: false, resubscribe: false });
    await service.subscribe('device/id/+/#'); 
    await service.end(true);
    // works fine

@robertsLando
Copy link
Member

@reijosirila Pleaase check my previous comment: #780 (comment)

I think the error is here: https://github.com/moscajs/aedes-persistence-redis/blob/master/persistence.js#L194

It could have been introduced from: https://github.com/moscajs/aedes-persistence-redis/pull/96/files

In that version the clientId has been sanitized using encodeURIComponent function like you can see here

That's why we also bumped a major for that. Could you do some tests and check if that's the reason?

@reijosirila
Copy link

reijosirila commented Dec 14, 2022

Added some debug before and after update to function subscriptionsByClient

    #### hgetall? clientSubKey? client:service-test
    #### hgetall? subs? { 'device/id/+/#': '0' }
    #### hgetall? toReturn? [ { topic: 'device/id/+/#', qos: 0 } ]
// after update
    ### hgetallBuffer clientSubKey(client.id)? client:service-test
    ### hgetallBuffer subs? { 'device/id/+/#': <Buffer 30> }
    ### hgetallBuffer toReturn? [ 48 ]

function returnSubsForClient return type is different, if data written by previous module.

{
      topic: string,
      qos: int
}[]

vs

number[]

After returning from returnSubsForClient previous type, everything works fine.
if data is written by new module, returnSubsForClient works:

    ###hgetallBuffer clientSubKey(client.id)? client:service-test
    ### hgetallBuffer subs? {
      'device/id/+/#': <Buffer 82 a5 74 6f 70 69 63 ad 64 65 76 69 63 65 2f 69 64 2f 2b 2f 23 a3 71 6f 73 00>
    }
    ### hgetallBuffer toReturn? [ { topic: 'device/id/+/#', qos: 0 } ]

@robertsLando
Copy link
Member

robertsLando commented Dec 14, 2022

@reijosirila I have just published a patch that at least should prevent broker to crash.

About the issue, the problem is that previously the subs where not encoded with msgpack, with version 9 the encode has been added so when decode is called it is called on a clean object (that was not encoded previously) and that causes the issue. Could you submit a fix to aedes-persistence-redis? You should just try to skip the decode when the object is already decoded here

@robertsLando robertsLando reopened this Dec 14, 2022
@reijosirila
Copy link

Yep. Just finished checking addSubscriptions function and this is the case. I'll try to do PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants