QMK protocol for interacting with other systems #895

jackhumbert · 2016-11-22T03:38:28Z

#692 discussed this a little, but I've begun to implement something using Midi's SysEx protocol, and would like to get some feedback on it before it gets too many eyeballs. Because it'll be supported on Midi, the values we're dealing with are 0x00 - 0x7F. I think it'd be possible to encode/decode to standard bytes (this is already being done for the data portions), but I'm not really sure it's worth it - I'm open to suggestions either way, but it won't really affect the discussion of how the protocol should be structured (other than max values, etc).

After experimenting with it for a couple days, I've come to classify every possible action into these categories - if you know of a use that is outside of this, let me know:

Getting data from the keyboard
Setting data on the keyboard
Sending data/action from the keyboard
Executing actions on the keyboard

Getting data from the keyboard

Things like default layer, rgb values/settings, backlight settings, eeprom stuff.

Setting data on the keyboard

Basically the same as the gets. These functions would also return the values they set, to be read by the application as an ACK.

Sending data/action from the keyboard

A big thing here would be unicode signals, but also special non-HID keycodes (extra modifiers), special functions to be executed on the OS/other system.

Executing actions on the keyboard

This could be broken down into the three levels we currently have (quantum, keyboard, user), with non-overlapping (or overlapping, like process_keycode) spaces.

After establishing those spaces, it's really just a matter of assigning numbers (byte addresses) to all of the different things we'll need to do. You can see the beginning/example of this here - numbers were chosen to try to leave space for things that make more sense, but will eventually be defined in an enum/something similar to avoid confusion and increase readability.

For those curious, Midi SysEx has four bytes at the beginning of the packet. It starts with 0xF0, and the remaining three are manufacturer, device, model, which are ignored in my implementation (but don't have to be). It's terminated with a 0xF7. These bytes will be stripped out by the time we get to processing this, so it's not really relevant to the discussion.
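A minimal sketch of the framing described above, assuming the three ID bytes are skipped without being checked. The function name and the return convention are made up for illustration and are not taken from the QMK code.

```c
#include <stddef.h>
#include <stdint.h>

#define SYSEX_START      0xF0
#define SYSEX_END        0xF7
#define SYSEX_HEADER_LEN 4   /* 0xF0 + manufacturer, device, model */

/* Hypothetical helper: points *payload at the first data byte and returns
 * the payload length, or -1 if the frame is malformed. The three ID bytes
 * are ignored, as in the description above. */
int sysex_payload(const uint8_t *frame, size_t len, const uint8_t **payload) {
    if (len < SYSEX_HEADER_LEN + 1) return -1;
    if (frame[0] != SYSEX_START || frame[len - 1] != SYSEX_END) return -1;
    *payload = frame + SYSEX_HEADER_LEN;
    return (int)(len - SYSEX_HEADER_LEN - 1);
}
```

Everything after this step only sees the payload, which is why the framing does not matter for the rest of the discussion.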

If you wanna see/try out the tray app I'm working on, you can see that here (Windows only, visual studio, c#).


wilba · 2016-11-22T04:51:08Z

Cool.

I'd recommend not being tied down to SysEx's 7-bit data in the protocol. Convert all incoming 7-bit SysEx messages into an 8-bit buffer, and then process that. When sending, create an 8-bit buffer and then convert to 7-bit SysEx just before sending. Converting a buffer is more efficient and only needs to be done in one place, in QMK and in the host code. Even if buffer conversion was done inefficiently and trivially (one nibble per SysEx buffer value), it makes the protocol generic and better supports the other communication methods like serial and raw HID.
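For reference, the "trivial" variant (one nibble per SysEx value) could look roughly like the sketch below. The function names are hypothetical, and it doubles the message size, so it only illustrates the idea rather than an efficient packing.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode: each 8-bit byte becomes two 7-bit-safe values (high nibble, then
 * low nibble). out must hold 2 * len bytes. Returns the bytes written. */
size_t sysex_encode_nibbles(const uint8_t *in, size_t len, uint8_t *out) {
    for (size_t i = 0; i < len; i++) {
        out[2 * i]     = in[i] >> 4;
        out[2 * i + 1] = in[i] & 0x0F;
    }
    return 2 * len;
}

/* Decode: out must hold len / 2 bytes. Returns the bytes written,
 * or 0 if len is odd (an incomplete pair). */
size_t sysex_decode_nibbles(const uint8_t *in, size_t len, uint8_t *out) {
    if (len % 2 != 0) return 0;
    for (size_t i = 0; i < len / 2; i++) {
        out[i] = (uint8_t)((in[2 * i] << 4) | (in[2 * i + 1] & 0x0F));
    }
    return len / 2;
}
```

The protocol code then works on plain 8-bit buffers, and only the SysEx transport calls these two functions.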

wilba · 2016-11-22T05:26:58Z

In case you're interested, here's some code I wrote for MIDIbox to do generalized 8-bit to 7-bit buffer conversion, used for doing firmware uploads via Java through SysEx messages:

http://svnmios.midibox.org/filedetails.php?repname=svn.mios&path=%2Ftrunk%2Fjava%2Forg%2Fmidibox%2Futils%2FUtils.java

jackhumbert · 2016-11-22T23:46:38Z

Awesome! Thanks - yeah, that sounds good.

jackhumbert · 2016-11-23T20:17:35Z

This is what I'm looking at for the first two bytes of the message protocol:

enum MESSAGE_TYPE {
    MT_GET_DATA =      0x10, // Get data from keyboard
    MT_GET_DATA_ACK =  0x11, // returned data to process (ACK)
    MT_SET_DATA =      0x20, // Set data on keyboard
    MT_SET_DATA_ACK =  0x21, // returned data to confirm (ACK)
    MT_SEND_DATA =     0x30, // Sending data/action from keyboard
    MT_SEND_DATA_ACK = 0x31, // returned data/action confirmation (ACK)
    MT_EXE_ACTION =    0x40, // executing actions on keyboard
    MT_EXE_ACTION_ACK =0x41, // return confirmation/value (ACK)
    MT_TYPE_ERROR =    0x80 // type not recognised (ACK)
};

enum DATA_TYPE {
    DT_NONE = 0x00,
    DT_HANDSHAKE,
    DT_DEFAULT_LAYER,
    DT_CURRENT_LAYER,
    DT_KEYMAP_OPTIONS,
    DT_BACKLIGHT,
    DT_RGBLIGHT,
    DT_UNICODE,
    DT_DEBUG,
    DT_AUDIO,
    DT_QUANTUM_ACTION,
    DT_KEYBOARD_ACTION,
    DT_USER_ACTION
};

DATA_TYPE will need to be synced between applications, but it leaves a lot of room for additional data types.
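To make the two-byte header concrete, here is a small sketch of how a message might be laid out: message type, data type, then payload. build_message is a hypothetical helper, and only a few enum values are repeated so that the block compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A subset of the values from the enums above, repeated for the sketch. */
enum { MT_GET_DATA = 0x10, MT_SET_DATA = 0x20 };
enum { DT_NONE = 0x00, DT_HANDSHAKE, DT_DEFAULT_LAYER };

/* Hypothetical helper: writes [message type][data type][payload...] into
 * out. Returns the total length, or 0 if out is too small. */
size_t build_message(uint8_t mt, uint8_t dt, const uint8_t *payload, size_t len,
                     uint8_t *out, size_t out_size) {
    if (out_size < len + 2) return 0;
    out[0] = mt;
    out[1] = dt;
    if (len > 0) memcpy(out + 2, payload, len);
    return len + 2;
}
```

Because DT_DEFAULT_LAYER gets its value from its position in the enum, both ends need the same enum order, which is the syncing concern mentioned above.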

PureSpider · 2016-11-24T07:55:07Z

This article lists some limitations and reasons on why NOT to use Midi SysEx:

Unfortunately, revisiting this the other week I was left disappointed; it seems Windows 7 simply doesn’t do proper MIDI mapping to hardware MIDI devices any more, and I’m not the first one to be disappointed by it. Of course you can use dedicated software for it, but that would take away from the simplicity of the implementation. Bang goes that idea.

While I didn't read it in whole, it might still be useful to make a choice here.

wilba · 2016-11-24T08:20:06Z

We're discussing other methods in #692. HID seems the best option, once we've sorted out the messy details, but it can still work on MIDI if that's already working for @jackhumbert

jackhumbert · 2016-11-24T15:06:53Z

@PureSpider that's really better discussed in #692 - this issue is for the actual protocol discussion. This also exclusively requires special host software (the QMK helper for the Midi implementation) by design.

PureSpider · 2016-11-24T15:57:13Z

Didn't see the other ticket. My bad 🙈

jackhumbert · 2016-11-26T16:58:45Z

Does it make sense to call this an API in the code? Right now I have things shoved into the lufa files, but I'd like to pull them out into their own files, and a folder for the different implementations.

jackhumbert · 2016-11-26T20:42:30Z

I stuck the main code for this in quantum/api.c, and the Midi SysEx-specific in quantum/api/api_sysex.c - other implementations should follow the same format, and define the macro SEND_BYTES in their .h file, like this:

#define SEND_BYTES(mt, dt, b, l) send_bytes_sysex(mt, dt, b, l)

They'll need to modify build_keyboard.mk as well, with something like this:

ifeq ($(strip $(API_SYSEX_ENABLE)), yes)
OPT_DEFS += -DAPI_SYSEX_ENABLE
SRC += $(QUANTUM_DIR)/api/api_sysex.c
OPT_DEFS += -DAPI_ENABLE
SRC += $(QUANTUM_DIR)/api.c
endif

Then the rules.mk/Makefile can have API_SYSEX_ENABLE ?= yes to turn things on.

wilba · 2016-11-27T01:56:39Z

Fewer macros in calling code would be better. Something like api_send_bytes would have the conditional compiling to route it to sysex_send_bytes or raw_hid_send_bytes.

Conversely, depending on the macros, sysex_receive_bytes and raw_hid_receive_bytes would route it to api_receive_bytes for generalized handling of the data.

FWIW there's no reason you can't compile all the API .c files, just #ifdef API_SYSEX_ENABLE the contents of api_sysex.c - keep things simpler.
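A rough sketch of that routing, assuming a second flag named API_RAW_HID_ENABLE (hypothetical) and stub senders that only count calls. The real senders would live in the transport-specific files.

```c
#include <stdint.h>

/* Normally set by the build system; defined here so the sketch compiles. */
#define API_SYSEX_ENABLE

/* Stand-ins for the transport-specific senders. They only record that
 * they were called. */
static int sysex_calls, raw_hid_calls;
void sysex_send_bytes(const uint8_t *bytes, uint16_t length)   { (void)bytes; (void)length; sysex_calls++; }
void raw_hid_send_bytes(const uint8_t *bytes, uint16_t length) { (void)bytes; (void)length; raw_hid_calls++; }

/* Single entry point for calling code; the conditionals live only here. */
void api_send_bytes(const uint8_t *bytes, uint16_t length) {
#if defined(API_SYSEX_ENABLE)
    sysex_send_bytes(bytes, length);
#elif defined(API_RAW_HID_ENABLE)   /* hypothetical flag name */
    raw_hid_send_bytes(bytes, length);
#else
    (void)bytes; (void)length;      /* no transport compiled in */
#endif
}
```

Calling code then never mentions a transport by name, so adding another one touches only this function.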

jackhumbert · 2016-11-27T02:11:27Z

We've debated the compile-everything method (#ifdef inside .c files) before (can't find the issue right now) - if we end up changing this, we'll need to do it throughout and be consistent, but for now I think we opted for the current implementation.

I think the macros are fine for now. The full code/demo is up on wu5y7 for those wanting to check it out.

fredizzimo · 2016-12-03T14:07:51Z

I haven't had time to look at this until now, but I have a few concerns.

There are only 256 (or 128 if we restrict ourselves to 7-bit) different data types. So if this protocol gets used for a lot of different things, we could run out of space.

User code (keyboards and keymaps) would have to have its own header which defines the custom data types it can use. So it could potentially complicate that code, as it would have to look something like this

    switch (data[0]) {
        case MT_SET_DATA:
            switch (data[1]) {
                case DT_USER_ACTION:
                    switch (data[2]) {
                        case DT_USER_CUSTOM: // DT_USER_CUSTOM is something that the user has defined by himself
...

As you can see, it generates a quite complex switch statement with deep nesting.

I think the term response (resp) would be a more natural name for what you call ACK. When I hear ACK I mostly think of something low level that happens automatically in the communication stack.

These issues lead me to think that we should perhaps use a single 16 bytes enumeration for this and include the types of action in the enumeration, so we would have something like this instead (does not include everything from the original example)

enum API {
    API_NONE = 0x00,
    API_HANDSHAKE,
    API_GET_DEFAULT_LAYER, // API functions that return values are prefixed with GET
    API_RESP_GET_DEFAULT_LAYER, // The response is named the same as the corresponding call, but with an extra RESP_
    API_SET_DEFAULT_LAYER,
    API_GET_CURRENT_LAYER,
    API_FLASH_LEDS, // Actions are always verbs
    // Always add new stuff after this line to avoid breaking protocol compatibility
    // Always add new stuff before this line to avoid breaking existing code
    API_KEYBOARD_START = 16384,
    API_KEYMAP_START = 32768,
};

// A keymap could define its own extensions like this
enum KEYMAP_API {
    API_TYPE_HELLO_WORLD = API_KEYMAP_START,
    API_RANDOMIZE_KEYS,
};

If we halve the range, and use one bit for the response, then we could further simplify this, by not having to define the responses separately, but the handling switch statement would look like this.

if (data & 1) {
    process_api_response(data & 0xFFFE);
}
else {
    process_api_request(data);
}

This would allow everything to be written without nested switches, make it easier to define custom actions, and allow for much more action types.

I will comment about the MACRO, and #ifdef stuff in #452

jackhumbert · 2016-12-03T15:32:01Z

I have a hard time seeing the number of data types growing that big, but we could go ahead and expand it to two bytes - that should be plenty, right? We can always increment the API version and add another byte (adding a version will be useful elsewhere too). Initially I had a different data type enum for each message type, but there's enough overlap that I think having just one would suffice. It also makes it really easy to move to other codebases (might be just a .h of enums that you can include/port).
Agreed, but macros could make this a lot easier to deal with, like we did for the leader key implementation.
Ah, yeah - that makes sense.

Do you mean 16 bit enum here? I like the safe range idea that jumps ahead to places for the other uses.

I think the switch statements work pretty well here - I like the idea of &ing to figure out how to handle it, but I don't know that it makes things that much easier. I'm imagining a lot of the complicated blocks being replaced by macros for the keymaps/users.

fredizzimo · 2016-12-03T16:27:12Z

Yes of course, I mean 16 bits, not bytes.

Regarding the idea having one bit to represent the response. The idea was to have that code snippet inside the api core, and let the user define two functions, process_api_request, and process_api_response. The user would just implement those and switch for the interesting things. But I guess some macro based solution would also be possible.
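As a sketch of that split, assuming bit 0 of a 16-bit ID marks a response: the core owns the dispatch and the user only supplies the two handlers. Here the handlers just record what they saw. In QMK they would presumably be weak symbols, which is an assumption on top of the comment above.

```c
#include <stdint.h>

/* Records for the sketch; a real keymap would act on the IDs instead. */
static uint16_t last_request, last_response;

/* User-supplied hooks. */
void process_api_request(uint16_t id)  { last_request = id; }
void process_api_response(uint16_t id) { last_response = id; }

/* Core-side dispatch: bit 0 marks a response, the other bits are the ID. */
void api_dispatch(uint16_t data) {
    if (data & 1) {
        process_api_response(data & 0xFFFE);
    } else {
        process_api_request(data);
    }
}
```

With this layout, request IDs are always even, and the matching response is the same ID with the low bit set.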

wilba · 2016-12-04T23:43:08Z

I've started using the raw HID interface (see #921) for basic commands like setting keymap keycodes, etc.

A separate ID for the "response" is redundant. For example, my protocol is fairly basic: the packet has a command byte, then the input data. A command to "get" a keycode out of the keymap will be: command ID, layer, row, column. In the handler, I just return the same buffer with the return value at the end, e.g. command ID, layer, row, column, keycode. My point is, you can use the same command ID because there's never a case where the device wants to use that command on the host. This means the host code will be much simpler. If you want to "validate" that the returned packet is for the command you sent, you can just do simple byte comparisons on the returned data.
I think the first byte should be enough to determine if the message should be handled by QMK or by the user, not three. This allows for QMK protocol to change without affecting any user's code. For example, let's say a packet which starts with 0xFF is a user-handled packet. It gets passed on to the handler in the user's code. Everything after the 0xFF is specific to the user code. If QMK wants to reorder the command IDs, refactor, whatever, the user's code is unaffected. It can return the whole packet to the host and it won't have any "QMK specific" values.
You're overthinking the protocol a little bit... grouping functionality when there's no real need. Just list the functions and give each one an ID. If you think the number of functions is going to exceed 255 then use 2 bytes, but there's really no need to have nested switch statements, it can be completely "flat", each command ID maps to a single function, each function pulls its arguments out of the packet.

So I propose: Two bytes at the start of the packet. If the first byte is 0xFF then it's a user-handled packet, which uses the second byte as a command ID. If the first byte is not 0xFF then it's a QMK handled packet. First byte + second byte becomes a 16-bit command ID. Each command ID maps to a function, like eeprom_write() or set_keymap_keycode() or set_backlight_effect().
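A minimal sketch of that routing rule. The byte order of the 16-bit command ID is not stated in the proposal, so treating the first byte as the high byte is an assumption, as are all the names.

```c
#include <stddef.h>
#include <stdint.h>

#define USER_PACKET_MARKER 0xFF

typedef enum { ROUTE_INVALID, ROUTE_USER, ROUTE_QMK } route_t;

/* Classifies a packet. For user packets *command_id is the second byte.
 * For QMK packets it is both bytes combined, first byte as the high byte
 * (assumed order). */
route_t route_packet(const uint8_t *packet, size_t len, uint16_t *command_id) {
    if (len < 2) return ROUTE_INVALID;
    if (packet[0] == USER_PACKET_MARKER) {
        *command_id = packet[1];
        return ROUTE_USER;
    }
    *command_id = (uint16_t)((packet[0] << 8) | packet[1]);
    return ROUTE_QMK;
}
```

A flat switch on command_id can then map each ID, or a range of IDs, to one handler function.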

jackhumbert · 2016-12-05T02:01:08Z

The protocol is written so that not only can hosts communicate with the device, but devices can communicate with each other as well - this is the reason for the response bit.
I think the same thing is accomplished by what we're offering currently. My goal with the process_user/keyboard/quantum level stuff is the same as the process_keycode layers, ie being able to cleanly override built-in functionality at every level. @fredizzimo, I think this might apply to the response bit as well - there's not really a need since the entire process function will be passing the data through (if implemented), regardless of request/response.
I think what we're planning for requires this - we have a handful of use cases right now, but this could easily get pretty big pretty quickly. A flat switch statement is always possible, but the nested one allows some common things to be done after each message/data type that might be useful in the future. Obviously other implementations can do whatever they'd like.

wilba · 2016-12-05T03:48:04Z

In cases where the response is different, then use a new command ID. It just makes no sense to require this for all the other commands which are implicitly host to device or device to host. Again you're just using up IDs in that range.
I'm saying it doesn't have to be that way, and would be better without it. You cannot predict what users want to do, so don't try, don't make them handle some specific "QMK custom blah" packet, give them the raw data if the packet doesn't start with the "user specific" byte. Problem solved. Do whatever you like with the other 254 starting bytes and the rest of the packet. This basically lets end users do things their own way, maximize use of the rest of the packet.
You can have it both ways... switch/case with multiple case labels can group and handle a range of command IDs. When you start splitting things up into groups, reserving values for things, you end up limiting things for no reason, and making it way more complex than it needs to be. There's no reason for this to have nested switch statements. Route all packets with a command ID in a group to a single handler function, passing in the entire packet. Let that function do more parsing of the data.

I am advocating that rather than imagining all the possible types of things you want to do and putting them into groups and labels, you pin down some concrete functions that you want to implement. Define the functions you want, as C functions with arguments. Make them atomic. Use small amounts of input/output arguments. For example, a function that sets one parameter, not an entire "config" struct. For example, in Zeal60, the keymap can be changed by the host, and it does it one keycode at a time, because a) the packet size is too small to send a whole keymap, but also b) there's no time penalty in sending things one keycode at a time, the user doesn't care if it takes 1ms or 1000ms.

If you start thinking about it as RPC (remote procedure call), not an implementation of some low-level network protocol or an asynchronous messaging system, it keeps things in perspective.

From the host's point of view, it should only need to include one enum, i.e. one set of command IDs, not have a complex nested switch statement to parse the packets coming from the device. Both sides can be handled by a single switch which routes single command IDs or command ID ranges to handler functions that do the work.

You can see how I've done this in the Zeal60 implementation which I will pull request soon... I use raw HID for changing keymaps and the backlight.

wilba · 2016-12-13T02:55:27Z

My implementation of using raw HID (for setting keymaps, backlight settings in EEPROM) is here:

https://github.com/Wilba6582/qmk_firmware/tree/zeal60

I've done what I wrote about in the previous comment. The first byte in the packet is the "function" identifier, and the rest of the data are the arguments to that "function".
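The sketch below is not taken from the Zeal60 branch. It only illustrates the "function ID plus arguments, reply in the same buffer" idea with a toy keymap, a made-up command ID, and an assumed byte order for the 16-bit keycode.

```c
#include <stddef.h>
#include <stdint.h>

#define CMD_GET_KEYCODE 0x01   /* hypothetical command ID */
#define LAYERS 2
#define ROWS   2
#define COLS   2

/* Toy keymap standing in for the real one. */
static const uint16_t keymap[LAYERS][ROWS][COLS] = {
    {{0x0004, 0x0005}, {0x0006, 0x0007}},
    {{0x001E, 0x001F}, {0x0020, 0x0021}},
};

/* Handles a packet in place. On success the reply is the request with the
 * keycode appended (high byte first) and the new length is returned.
 * Returns 0 for unknown commands, bad arguments, or a too-small buffer. */
size_t handle_packet(uint8_t *buf, size_t len, size_t capacity) {
    if (len < 1) return 0;
    switch (buf[0]) {
        case CMD_GET_KEYCODE: {
            if (len != 4 || capacity < 6) return 0;
            uint8_t layer = buf[1], row = buf[2], col = buf[3];
            if (layer >= LAYERS || row >= ROWS || col >= COLS) return 0;
            uint16_t keycode = keymap[layer][row][col];
            buf[4] = (uint8_t)(keycode >> 8);
            buf[5] = (uint8_t)(keycode & 0xFF);
            return 6;
        }
        default:
            return 0;
    }
}
```

The host can check that the first four bytes of the reply match what it sent, which is the byte comparison mentioned earlier in the thread.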

fredizzimo · 2016-12-27T20:55:52Z

Sorry for the long response, I think this description is more complex than the actual implementation would look like.

I have been thinking about the protocol a bit more, since I'm planning to integrate it to the visualizer, and also the emulator that I'm currently working on.

I don't think an asynchronous protocol, like the one described in the original post, is the way to go, as it makes things very hard to use. While the usage seems reasonably simple in the QMK helper application, in the general case it isn't. Say we want to increase or decrease the backlight level; on the host side we would need to do something like this.

int state;
int backlight_inc;

void increase_backlight(int inc) {
    // Remember what we were doing, then ask for the current value
    state = STATE_CHANGING_BACKLIGHT;
    backlight_inc = inc;
    MT_GET_DATA(DT_BACKLIGHT, NULL, 0);
}

void process_api(uint16_t length, uint8_t * data) {
    switch (data[0]) {
        case MT_GET_DATA_ACK:
            switch (data[1]) {
                case DT_BACKLIGHT: {
                    uint8_t backlight = data[2];
                    switch (state) {
                        case STATE_CHANGING_BACKLIGHT:
                            backlight += backlight_inc;
                            MT_SET_DATA(DT_BACKLIGHT, &backlight, 1);
                            break;
                        case STATE_ANOTHER_THING_THAT_FIRST_NEED_TO_READ_THE_BACKLIGHT:
                            break;
                    }
                    break;
                }
            }
            break;
    }
}

As you can see, even the simplest things would need a quite complex state machine and saving of variables for later use. The reason the QMK helper application can get away with it is that it requests the whole keyboard state at the beginning and assumes that the keyboard never changes it by itself, so it just sends the change based on the local values that it has saved. However, that's not a general solution and it only works for the simplest use cases. A little better would be to periodically refresh the state from the keyboard, but that requires sending a lot of unnecessary data, and depending on the refresh rate, there is still a lot of potential for things to go wrong.

In high-level languages like C# or Python, the solution could still be manageable, since things could be wrapped in async/await or lambdas to make it easier to use. But if we want to use the API for sending commands to the host or from keyboard to keyboard it will be complicated. It will also require more code, and more memory for variables related to state handling, both of which are very bad for devices with as little memory as the ones we are dealing with.

Additionally, I think the above example switch statement serves as yet another argument for how difficult the proposed protocol is to work with, so my proposal below will just have a single enum with ranges, just like @Wilba6582 and I already commented on previously.

I'm proposing a simple synchronous request/response protocol and I think @Wilba6582 is looking for the same with his comments related to RPC. But both his Zeal keyboard and the raw HID API are still asynchronous in nature. Note that there's a difference between synchronous/asynchronous on the protocol level and blocking/non-blocking on the application level. Here I'm talking about being synchronous on the protocol level, which means that a request will always be followed by a response, after which another request can be sent again. An asynchronous protocol would allow us to send multiple requests in parallel and receive the responses at any time, possibly in a different order. As an example, by this definition HTTP is synchronous while the underlying transport protocol TCP is asynchronous. Non-blocking simply means that the API call will return immediately, while a blocking call waits for the complete operation to finish before it returns.

Here I'm also proposing a blocking API, since a non-blocking one would cause a lot of the same problems as an asynchronous protocol has, but if needed we could quite easily allow requests without responses to be non-blocking at the cost of some extra memory needed. However since the latencies we are dealing with are so small, with the worst case probably around 2ms (USB limit) for the raw hid, I'm quite sure that a blocking API will be enough. Other communication channels will be much faster.
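To illustrate what "blocking" means here, the sketch below sends one request and then polls until exactly one response arrives. The transport is a mock whose reply appears after three polls, and every name in it is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* --- Mock transport: the reply shows up after a few polls --- */
static int polls_until_reply;
static uint8_t pending_reply;

static void transport_send(uint8_t request) {
    pending_reply = (uint8_t)(request + 1);   /* fake "device" logic */
    polls_until_reply = 3;
}

static bool transport_poll(uint8_t *reply) {
    if (polls_until_reply > 0 && --polls_until_reply == 0) {
        *reply = pending_reply;
        return true;
    }
    return false;
}

/* Blocking call: one request, then wait for exactly one response.
 * Gives up after max_polls, which a caller could treat as a disconnect. */
bool api_request_blocking(uint8_t request, uint8_t *reply, unsigned max_polls) {
    transport_send(request);
    for (unsigned i = 0; i < max_polls; i++) {
        if (transport_poll(reply)) return true;
    }
    return false;
}
```

The caller needs no state machine: the backlight example above becomes a get, an add, and a set, written in sequence.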

I'm also adding target addresses to the API, since we have to deal with different targets, both multiple different types of host applications, and slave keyboards/devices. The addressing scheme is the following

0 - The master keyboard
1 - 64 The slave keyboards, the keyboard could either use a daisy chain setup, like the Infinity Ergodox, or assign the numbers as it pleases.
65 - 254 Address of the host application. We assign these numbers as different applications are made, this way we can communicate with several different applications at the same time.
255 - All slave keyboards. Special case to broadcast a command to all slaves. Only empty responses are supported.
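The address ranges above map directly onto a small classifier. This is only a sketch, with made-up names:

```c
#include <stdint.h>

typedef enum {
    TARGET_MASTER,      /* 0 */
    TARGET_SLAVE,       /* 1 - 64 */
    TARGET_HOST_APP,    /* 65 - 254 */
    TARGET_ALL_SLAVES   /* 255, broadcast */
} target_kind_t;

/* Maps a numeric target address to the kind of endpoint it refers to. */
target_kind_t classify_target(uint8_t target) {
    if (target == 0)   return TARGET_MASTER;
    if (target <= 64)  return TARGET_SLAVE;
    if (target <= 254) return TARGET_HOST_APP;
    return TARGET_ALL_SLAVES;
}
```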

Each address can have a different underlying transport protocol and/or physical communication medium. So we need runtime polymorphism, but this can easily be done in a similar way as we do for host_driver_t.

We also need to be able to connect to a specific address, and to be able to determine if we are connected or not. Finally we need some kind of callback that is called once a connection is performed, so that we can send some initial state to the other party.

So let's put everything together, starting with the base packet format.

typedef struct {
    uint16_t request_id : 15;
    uint16_t is_response : 1; // Could be bool, but uint16_t makes it compatible with MS Visual C++
    uint8_t data[];
} packet_format_t;

We need to use 1 bit for the response, as the same physical communication channel might be used for both incoming requests and responses, and they can be interleaved. But it still leaves us with 32768 possible values.
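Bit-field layout is implementation-defined in C, so a host written in another language may prefer explicit packing. A sketch, assuming the request ID sits in the low 15 bits, the response flag in the top bit, and the two bytes are sent little-endian (function names are hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

/* Writes the 2-byte header. Returns false if the ID does not fit in 15 bits. */
bool header_pack(uint16_t request_id, bool is_response, uint8_t out[2]) {
    if (request_id > 0x7FFF) return false;
    uint16_t word = (uint16_t)(request_id | (is_response ? 0x8000u : 0u));
    out[0] = (uint8_t)(word & 0xFF);   /* low byte first */
    out[1] = (uint8_t)(word >> 8);
    return true;
}

/* Reads the 2-byte header back into its two fields. */
void header_unpack(const uint8_t in[2], uint16_t *request_id, bool *is_response) {
    uint16_t word = (uint16_t)(in[0] | (in[1] << 8));
    *request_id  = word & 0x7FFF;
    *is_response = (word & 0x8000u) != 0;
}
```

Whether this matches the bytes a given compiler produces for the bit-field struct would need to be checked per toolchain.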

In the QMK core code we define an enumeration with the following structure

enum API_REQUESTS {
    API_QMK_BEGIN,
    API_QMK_GET_BACKLIGHT,
    API_QMK_SET_BACKLIGHT,
    ...
    API_QMK_END = 8191,
    API_KEYBOARD_BEGIN = 8192,
    API_KEYBOARD_END = 16383,
    API_KEYMAP_BEGIN = 16384,
    API_KEYMAP_END = 24575,
};

Note that we still have 8192 unassigned values, that could be used for something else. The ranges are also not completely set in stone, but changing them would need recompilation and possibly changes to the source code of all the applications that are using the API.

This enum is easy to process in the core of QMK.

void process_api_request_qmk(uint8_t target, uint16_t request_id, void* data) {
    switch(request_id) {
        case API_QMK_GET_BACKLIGHT: {
            // Braces give the declaration its own scope, which C requires after a case label
            api_get_backlight_response_t resp;
            resp.backlight = get_backlight();
            api_send_response(target, request_id, &resp, sizeof(resp));
            return;
        }
        case API_QMK_SET_BACKLIGHT: {
            api_set_backlight_request_t* req = (api_set_backlight_request_t*)(data);
            set_backlight(req->backlight);
            // The code internally detects that no response is sent, so it will send an empty one
            return;
        }
    }
}

__attribute__ ((weak))
void process_api_request_keyboard(uint8_t target, uint16_t request_id, void* data) {
}

__attribute__ ((weak))
void process_api_request_keymap(uint8_t target, uint16_t request_id, void* data) {
}
   
void process_api_request(uint8_t target, uint16_t request_id, void* data) {
    switch(request_id) {
        case API_QMK_BEGIN ... API_QMK_END:
            process_api_request_qmk(target, request_id, data);
            break;
        case API_KEYBOARD_BEGIN ... API_KEYBOARD_END:
            process_api_request_keyboard(target, request_id, data);
            break;
        case API_KEYMAP_BEGIN ... API_KEYMAP_END:
            process_api_request_keymap(target, request_id, data);
            break;
    }
}

You might have noticed api_get_backlight_response_t and api_set_backlight_request_t; we define these kinds of structures for all requests, in api_responses.h and api_requests.h respectively. Not all requests need both of them, so it's perfectly valid to leave either of them undefined. There's also no direct connection between the request and response data format; both of them can be completely different. However, we should always use this naming convention, so it's easy to go from the request id to the struct definition. Furthermore, the structures should be packed and aligned for easier integration with external programs.

typedef struct __attribute__((packed, aligned(4))) {
    uint8_t backlight;
} api_get_backlight_response_t;

The alignment is not strictly needed, but in some cases it will generate better code on the ARM processors. Also note that the internal endianness is little endian on all the AVR and ARM processors that we support. Since normal PCs are also little endian, this makes things easy to work with. If we ever want to support a big endian microcontroller, we would have to rethink this. One option would be to switch to C++ for that and use wrapper classes which convert the endianness on set. But I really doubt there's any need for that, now or in the future, especially as most modern architectures are actually dual endian.

There are two reasons why I prefer this struct format rather than some manual serialization, like the current API implementation and the Zeal implementation; the generated code will be smaller and it's also much easier to see what kind of request/response format the API uses.

I also recommend that if we change the data format, we also make a new version of the request, so if the backlight takes another parameter, we add another request API_SET_BACKLIGHT_V2, with its own structures. We should also never delete any previously existing definitions; instead we should print a warning and disconnect if we receive an unhandled request.
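A sketch of how the two versions might coexist in one handler. The enum values, the extra breathing field, and the function name are invented for illustration, and __attribute__((packed)) assumes GCC or Clang.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    API_SET_BACKLIGHT = 2,       /* illustrative values only */
    API_SET_BACKLIGHT_V2 = 3
};

typedef struct __attribute__((packed)) {
    uint8_t backlight;
} api_set_backlight_request_t;

typedef struct __attribute__((packed)) {
    uint8_t backlight;
    uint8_t breathing;           /* hypothetical extra parameter */
} api_set_backlight_v2_request_t;

static uint8_t backlight_level, breathing_on;

/* Returns false for an unhandled request, which is where the proposal
 * says to print a warning and disconnect. */
bool process_request(uint16_t request_id, const void *data) {
    switch (request_id) {
        case API_SET_BACKLIGHT: {
            const api_set_backlight_request_t *req = data;
            backlight_level = req->backlight;
            return true;
        }
        case API_SET_BACKLIGHT_V2: {
            const api_set_backlight_v2_request_t *req = data;
            backlight_level = req->backlight;
            breathing_on = req->breathing;
            return true;
        }
        default:
            return false;
    }
}
```

Old hosts keep sending the original request and keep working, while new hosts can use the V2 request.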

Keyboards define their own similar enums, but in api_keyboard.h, api_keyboard_requests.h and api_keyboard_responses.h. Keymaps do the same but keyboard is replaced by keymap. The enum looks like this

enum API_KEYBOARD_REQUESTS {
    API_KEYBOARD_DO_SOMETHING = API_KEYBOARD_BEGIN,
    API_KEYBOARD_DO_SOMETHING_ELSE,
    ...
};

You already saw a glimpse of the public API, but here it is in its full form.

typedef void (*api_disconnection_callback_t)(int target);

typedef struct {
    void* (*send_request)(uint8_t target, uint16_t request_id, void* data, uint8_t datasize);
    void (*send_response)(uint8_t target, uint16_t request_id, void* data, uint8_t datasize);
    bool (*connect)(uint8_t target);
    bool (*is_connected)(uint8_t target);
} api_driver_t;

void* api_send_request(uint8_t target, uint16_t request_id, void* data, uint8_t datasize);
void api_send_response(uint8_t target, uint16_t request_id, void* data, uint8_t datasize);
bool api_connect(uint8_t target); // returns true if the connection is successful
bool api_is_connected(uint8_t target);
void api_add_disconnection_callback(api_disconnection_callback_t cb);

void add_target(uint8_t target_mask, api_driver_t* driver);

Note that the data size is limited to 255 bytes (datasize is a uint8_t) because allowing bigger sizes would use too much memory anyway. And I actually think we have to limit it to 64 bytes, as that's the maximum amount we can send in one USB HID packet.

The target is one of the numerical addresses, which I described above.

Note that both api_send_request and api_send_response should be no-ops on disconnected targets.

If you call api_send_request at a point where a response is required, the connection will be dropped, and the same applies the other way around.

api_send_request returns a pointer to the response, which can be cast to the right type. If the target is disconnected, or becomes disconnected during the call, a null pointer is returned. When there's no actual response, the pointer will still be valid, but will point to unspecified data. We could also possibly add some setjmp/longjmp combination for easier use when you want to perform multiple calls in succession without having to check the result of each of them.

Note that you can call connect even if the target is already connected.

The add_target function should only be called when the keyboard is initialized.

The driver API functions should behave as if the connection is reliable, so if the underlying connection is unreliable, it should automatically retry until it's determined that there's a disconnection.
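A sketch of that retry behaviour at the driver level, with a mock link that fails a configurable number of times. The retry limit and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_RETRIES 3   /* hypothetical limit */

/* Mock unreliable link: fails failures_left times, then succeeds. */
static int failures_left;
static bool link_try_send(const uint8_t *data, uint8_t size) {
    (void)data; (void)size;
    if (failures_left > 0) { failures_left--; return false; }
    return true;
}

static bool connected = true;

/* Driver-level send: retries transparently and marks the target as
 * disconnected once the retries are exhausted. A disconnected target
 * makes this a no-op, as described above. */
bool driver_send(const uint8_t *data, uint8_t size) {
    if (!connected) return false;
    for (int attempt = 0; attempt <= MAX_RETRIES; attempt++) {
        if (link_try_send(data, size)) return true;
    }
    connected = false;
    return false;
}
```

Code above the driver sees only two outcomes, delivered or disconnected, which is what lets the API treat the link as reliable.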

jackhumbert · 2016-12-27T21:30:10Z

@fredizzimo thanks for the full write-up on this! It all makes sense and sounds good :) I like the typing of the responses and requests as well. I suppose after hearing from @Wilba6582 on this, we could close this issue and make a new one for the implementation of the final design.

skullydazed · 2016-12-27T21:59:06Z

I like this overall. The choice of a synchronous protocol will help to keep things conceptually simpler, making it easier for outsiders to dive in. That ticks the boxes for several of my goals.

65 - 254 Address of the host application. We assign these numbers as different applications are made, this way we can communicate with several different applications at the same time.

Are we concerned about app proliferation here? What happens when we have 189 apps assigned and someone wants to create a new app? Do we have to start tracking down defunct apps?

255 - All slave keyboards. Special case to broadcast a command to all slaves. Only empty responses are supported.

Is a response required? IE, could I use the lack of a response as a signal that a particular slave did not execute a "group command"?

fredizzimo · 2016-12-27T22:24:16Z

Are we concerned about app proliferation here? What happens when we have 189 apps assigned and someone wants to create a new app? Do we have to start tracking down defunct apps?

I guess we could divide it into three parts. QMK, keyboard and keymap specific apps, with let's say 16 keyboard and keymap specific ones, and the rest assigned to QMK. That way I'm pretty sure we never reach the limit, although I'm relatively sure that we won't in any other case either.

Is a response required? IE, could I use the lack of a response as a signal that a particular slave did not execute a "group command"?

This probably needs more re-thinking. But I think that if the slave doesn't respond, it should be automatically disconnected. Otherwise the command should always have executed.

The response that the host gets back could be null for "a slave disconnected, but some slaves might have executed the command" and a valid pointer if all the slaves executed it. You can always use the is_disconnected check to check which slaves have disconnected.

Technically this would be by far the hardest thing to implement, especially with daisy chaining, where the command can pass through any number of intermediate slaves, so a disconnect in the middle will disconnect the rest of the link. The serial link protocol does have partial support for it, but it doesn't know anything about disconnection yet, especially not for slaves.

I guess another alternative design for broadcast, which would allow responses, would be to return an array of pointers instead of a single pointer. But that would mean that we need memory for all the responses, so it's probably out of the question for the smaller AVR chips.

skullydazed · 2016-12-27T23:01:32Z

I guess we could divide it into three parts. QMK, keyboard and keymap specific apps, with let's say 16 keyboard and keymap specific ones, and the rest assigned to QMK. That way I'm pretty sure we never reach the limit, although I'm relatively sure that we won't in any other case either.

I think the limits as you've initially proposed are a good start, I primarily want to make sure we understand what we do if we start approaching these limits. If dedicating 16 addresses to keyboards and keymaps will make that easier to avoid I'm all for it. If it's adding complexity to address a limit we don't think we'll hit I don't see a lot of reason to do it.

This probably needs more re-thinking. But I think that if the slave doesn't respond, it should be automatically disconnected. Otherwise the command should always have executed.

I think I understand your reasoning here. This will preclude some use cases for a broadcast address, but if someone really needs to receive a response from all slaves, they could send_request(255, ...), and then each slave could, once all the responses have been sent, send_request(0, ...) with a command that the master in turn answers with an empty response?

fredizzimo · 2016-12-27T23:09:27Z

@skullydazed, I agree with not complicating the address ranges until we start seeing problems.

And that would be one way of working around the no-response limit. Another would be to simply not use broadcast, and just loop through all destinations and contact them each in turn. That would probably be relatively slow though.

wilba · 2016-12-28T01:52:06Z

@fredizzimo To clarify, the protocol I'm using in my Zeal60 implementation is synchronous and blocking, even though the raw HID can be used asynchronously. I agree that this is easier than defining an asynchronous protocol. I can't see any reason why requests from the host can't be responded to immediately, or why the host would be too busy to wait for a response.

Manual serialization is important, even if it does use extra code: forcing big-endian in the protocol means it's independent of the CPU's (and host's) endian-ness. Also, defining a struct for every message leads to code bloat when, most of the time, you're probably only pulling out a few ints from the packet and routing them to some set functions.
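As a rough illustration of the manual big-endian serialization described here, a minimal C sketch might look like the following. The helper names (put_u16_be, get_u16_be, get_u16_array_be) are hypothetical and are not taken from the Zeal60 code.

```c
#include <stddef.h>
#include <stdint.h>

/* Write a 16-bit value to the wire in big-endian order (high byte first). */
static void put_u16_be(uint8_t *out, uint16_t value) {
    out[0] = (uint8_t)(value >> 8);
    out[1] = (uint8_t)(value & 0xFF);
}

/* Read a 16-bit big-endian value from the wire. */
static uint16_t get_u16_be(const uint8_t *in) {
    return (uint16_t)(((uint16_t)in[0] << 8) | in[1]);
}

/* Decode `count` consecutive big-endian 16-bit values from a packet payload. */
static void get_u16_array_be(const uint8_t *in, uint16_t *out, size_t count) {
    for (size_t i = 0; i < count; i++) {
        out[i] = get_u16_be(in + 2 * i);
    }
}
```

Because the byte order is spelled out in the shifts, the same code gives the same result on little-endian and big-endian CPUs.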

I'm not really seeing the point of the addressing scheme, or even the use case where multiple apps will be simultaneously connected to the keyboard.

On the topic of versioning the protocol: if you need to change the protocol, change it in place and bump the version number of the protocol. Hosts can query this, compare it to their own version number, and report that they are out of date. As long as the method to query the protocol version is unchanged, this will be "forward-compatible" and old apps will not even try to work with newer firmware.

My advice is to stop trying to predict what anyone would ever want to do with this protocol and just go ahead and use what's available now (i.e. the raw HID) in a keyboard's own code and identifying actual uses, functions, etc. you need to get the job done. This is the beauty of forks and branches. Make a branch and have a play, show what you've done, copy what others do, etc.

For example, the "keymaps in EEPROM" functionality I added to Zeal60 is probably something others may want to integrate into their own keyboard firmware. They can patch it in at the same level, experiment, collaborate with me on a GUI, etc. and then when it's at a fully functional, stable level, refactor it into the generalized QMK protocol. That same workflow can apply to any other kind of functionality. Sort out the actual functionality required in "custom" uses of raw HID and then discuss/refactor later.

fredizzimo · 2016-12-28T10:20:42Z

@Wilba6582,

I might have missed something, but I would still classify your protocol as asynchronous but blocking. The reason for this is that you don't receive the response when you call raw_hid_send; instead you get it at a later time, when raw_hid_receive is automatically called. Of course your current implementation is synchronous, at least from the keyboard's perspective, since you don't have any keyboard to host commands.

I also agree that it should be possible to have manual serialization, and nothing prevents you from doing that with my proposed model either. Just define a message with a uint8_t buffer, and manually serialize to that. Actually you wouldn't even need to typedef the struct, you could just pass uint8_t arrays to the send functions and cast the return value to uint8_t.

What I don't agree on is defining big endian as the wire protocol and forcing manual serialization everywhere, when much simpler casting is enough. All the platforms we are dealing with are little endian, unless someone is working with old PowerPC Mac hosts, or even more exotic platforms. So let's define the wire protocol as little endian, so that casting structs is a valid way of doing things. Big endian platforms could do the manual serialization, if we ever need to support them.

Also defining structs doesn't cause any code bloat. The code is easier to read than manual serialization, since we are just setting and reading named fields, instead of having to interpret something like this

uint16_t alpha_mods[5];
alpha_mods[0] = data[1] << 8 | data[2];
alpha_mods[1] = data[3] << 8 | data[4];
alpha_mods[2] = data[5] << 8 | data[6];
alpha_mods[3] = data[7] << 8 | data[8];
alpha_mods[4] = data[9] << 8 | data[10];
backlight_config_set_alphas_mods( alpha_mods );

we have

// Defined separately
typedef struct __attribute__((packed, aligned(4))) {
    uint16_t alpha_mods[5];
} api_backlight_config_set_alphas_mods_request_t;

// Equivalent to the snippet above, assuming a little-endian wire format
// and that data points at the start of the payload
api_backlight_config_set_alphas_mods_request_t* r =
    (api_backlight_config_set_alphas_mods_request_t*)(data);
backlight_config_set_alphas_mods( r->alpha_mods );

The length of api_backlight_config_set_alphas_mods_request_t could be debated. Maybe request could be shortened to req, api could potentially be dropped, and req added in front instead. _t isn't really needed.

So the request id would be API_BACKLIGHT_CONFIG_SET_ALPHAS_MODS
The request type req_backlight_config_set_alphas_mods
The potential response type res_backlight_config_set_alphas_mods

I already have real use cases for the addressing. The Infinity Ergodox is a split keyboard, and I'm going to make an emulator, which can be used for debugging and testing things without having to flash the keyboard. So I have the following targets

Any slave should be able to send their physical keyboard states to the master
The master should be able to broadcast what needs to be shown on the LCDs and LEDs to all slaves
The master needs to send the physical key states to the emulator
The emulator needs to send the USB keyboard report back to the keyboard, which then will send it as a real keyboard press to the host
The emulator needs to contact the other emulated half, in the same way the real physical keyboard would. This would probably use a TCP transport for the API.

I don't really see the need for addressing multiple applications running on the host at the moment. But as you said, let's not predict what the users want to do, so let's not restrict the protocol to just one host application. I also don't necessarily see the need for the keyboard to even open a connection to the host, but let's not restrict that either by having a too restrictive API.

For versioning, you are probably right. I wanted to have versioning on the message level, to avoid the situation where completely unrelated changes break your application. But handling it globally is simpler, and it also handles logical protocol changes, which would be unnatural to do with message based versioning.

However we need two version numbers

Current version. Always incremented on every API change.
Compatible version. Only incremented when existing messages are changed, or when there are changes to the existing protocol logic.

A connection is only accepted if the client version is greater than or equal to the compatible version and less than or equal to the current version. Correct usage of the version numbers needs to be strictly enforced by code reviews.

Furthermore you need to be able to request a lower version for outgoing connections. Otherwise the keyboard will stop being compatible with existing applications as soon as something changes. If you try to request a lower version than your compatible version, then it should internally request the compatible version. This way, we can just leave the connection call as it is, even if the compatible version changes, and it won't connect to old applications with incompatible version numbers. A connection attempt with a version number greater than the current version should automatically fail, as that's a programmer error.
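The two-version rule could be sketched in C roughly as follows. The version values and the function names (api_accepts_version, api_resolve_requested_version) are made up for illustration; only the accept range and the clamping behaviour come from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical version numbers, for illustration only. */
#define API_CURRENT_VERSION    7u /* incremented on every API change */
#define API_COMPATIBLE_VERSION 4u /* incremented only on breaking changes */

/* Incoming connection: accept only versions in [compatible, current]. */
static bool api_accepts_version(uint16_t client_version) {
    return client_version >= API_COMPATIBLE_VERSION &&
           client_version <= API_CURRENT_VERSION;
}

/* Outgoing connection: decide which version to actually request.
 * Returns false when the caller asks for a version newer than the
 * current one, which is treated as a programmer error. */
static bool api_resolve_requested_version(uint16_t requested, uint16_t *resolved) {
    if (requested > API_CURRENT_VERSION) {
        return false;
    }
    /* Requests below the compatible version are raised to it. */
    *resolved = (requested < API_COMPATIBLE_VERSION)
                    ? (uint16_t)API_COMPATIBLE_VERSION
                    : requested;
    return true;
}
```

With these example numbers, a request for version 2 is silently raised to 4, a request for 5 goes out as 5, and a request for 8 fails.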

wilba · 2016-12-28T12:43:56Z

@fredizzimo I'm choosing not to assume (and rely) on the architecture or compiler's choice of endian-ness. It's the same deal on the host side - I can't assume the host code is the same endian-ness, or can even do magic casting tricks like C/C++, so if the host has to do explicit serialization, then at least do it big-endian (like how we work with ints even if it's not how they're actually stored in memory).

I will concede though that moving that serialization to/from a struct would be good, if only to get it out of the protocol handler, which would be a big switch statement... if this was C++ I would have classes for every "command" that would have a read(uint8_t *data) method which could load its state from a packet, and stick that serialization in there. Ideally, I could have those classes pulled into the host code. I sort of do some of this by sharing the enums for the command IDs, but it would be nice to have common code that does both. But this isn't C++. I suppose I could fake it with structs and functions. Maybe my next commit will have something like this, but I'm unlikely to budge on the big-endian ;-)

My point about the versioning was to keep it ultra simple, not bother with "compatible version" stuff. Consider that users of this protocol are going to keep both in sync anyway, I don't see the point in spending too much effort in ensuring that code not built from the same "version" of the interface works together. If you're envisioning some totally generalized QMK app that is supposed to work with a bunch of keyboards all of different protocol versions, then that's a use case I think extremely unlikely.

jackhumbert · 2017-01-05T17:32:16Z

Would a timeout and request priority be useful here? Requests that are more important would have a longer timeout, whereas less important ones would fail quicker. That would ensure that at least one gets a response if they're done at the same time, then the lower priority one could retry if needed.

skullydazed · 2017-01-05T22:23:12Z

If you use a timeout mechanism you should add a random wait to each timeout. That way, if two requests of the same priority get into a deadlock situation, one of them will time out first, and you won't have the situation of both timing out and retrying at the same time.
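A priority-based timeout with random jitter might be computed along these lines. This is only a sketch: the constants, the xorshift generator and the name request_timeout_ms are all assumptions, not anything that exists in QMK.

```c
#include <stdint.h>

/* Hypothetical tuning constants. The priority step is larger than the
 * jitter range, so a higher priority always waits longer. */
#define BASE_TIMEOUT_MS  10u
#define PRIORITY_STEP_MS 20u
#define MAX_JITTER_MS    8u

/* Small xorshift PRNG so the sketch does not depend on a platform RNG.
 * The state must be seeded with a nonzero value. */
static uint32_t xorshift32(uint32_t *state) {
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Higher priority gives a longer timeout; the random jitter in
 * [0, MAX_JITTER_MS) keeps two equal-priority requests from timing
 * out and retrying in lockstep. */
static uint32_t request_timeout_ms(uint8_t priority, uint32_t *rng_state) {
    uint32_t jitter = xorshift32(rng_state) % MAX_JITTER_MS;
    return BASE_TIMEOUT_MS + (uint32_t)priority * PRIORITY_STEP_MS + jitter;
}
```

Each side would need a different seed (for example derived from a timer), otherwise both would draw the same jitter.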

jackhumbert · 2017-01-05T22:28:19Z

Yeah! I thought about the random timeout, but didn't think about both devices having the same request - nice :)

When connecting, we could allow each device to declare/request a timeout (and therefore a master/slave-type relationship), to ensure they're different/far enough apart to be useful.

fredizzimo · 2017-01-05T22:58:49Z

Currently the algorithm I have is this. It's only used for connection requests at the moment, but I will soon add it for everything.

send request
get next packet from the destination
while next packet is a request
    handle it
    send the response
    get next packet from the destination
// next packet is now a response
return the response to the caller

The idea is to always handle incoming requests first to free up the other part. So this won't deadlock, as long as the other part isn't continuously sending requests.

The only bad thing is that we need to be prepared to handle a request (from the endpoint you are sending to) during any send operation. But I think we can live with that limitation, as the biggest problem would be when we want to return some data that is invalid during part of the scan loop, and I can't think of many such cases. And you can always copy the data, or move the send function to another place if that's the problem. Other endpoints stay unaffected, since there's no need to handle incoming requests from them.

Timeouts would complicate things a lot on the protocol level, and I don't think we could resolve the situation with timeouts.
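The loop above could be written in C roughly like this. The packet layout, the in-memory link_t stand-in for a transport and the placeholder handle_request are all hypothetical; only the control flow of api_send follows the steps listed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    bool    is_request; /* true: the peer wants something, false: a reply to us */
    uint8_t id;
    uint8_t value;
} packet_t;

/* In-memory stand-in for a real link: a scripted inbox plus a log of
 * everything that was sent. */
#define LINK_SLOTS 4
typedef struct {
    packet_t inbox[LINK_SLOTS];
    size_t   inbox_len, inbox_pos;
    packet_t sent[LINK_SLOTS];
    size_t   sent_len;
} link_t;

static void link_send(link_t *l, const packet_t *p) {
    if (l->sent_len < LINK_SLOTS) l->sent[l->sent_len++] = *p;
}

/* A real transport would block here; the fake returns an empty
 * response once the script runs out. */
static packet_t link_receive(link_t *l) {
    packet_t empty = { .is_request = false, .id = 0, .value = 0 };
    return l->inbox_pos < l->inbox_len ? l->inbox[l->inbox_pos++] : empty;
}

/* Placeholder handler: echoes the id and answers with value + 1. */
static packet_t handle_request(const packet_t *req) {
    packet_t res = { .is_request = false, .id = req->id,
                     .value = (uint8_t)(req->value + 1) };
    return res;
}

/* Send a request, serve any requests the peer sends in the meantime,
 * then return the first response that arrives. */
static packet_t api_send(link_t *l, const packet_t *request) {
    link_send(l, request);
    packet_t next = link_receive(l);
    while (next.is_request) {
        packet_t response = handle_request(&next);
        link_send(l, &response);
        next = link_receive(l);
    }
    return next;
}
```

If the peer's own request arrives before its response, api_send answers it first and only then returns, which is what prevents two blocked senders from waiting on each other.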

If we totally fail the send, then the sender would either manually have to retry it and cause another deadlock, since it still can't handle the incoming request, or continue with the keyboard loop and retry at a later point. That in turn would cause a lot of extra logic in the application code, which is something that we probably want to avoid.

I will also add an async_send function. It will still have guaranteed delivery, unless there's a disconnect, and it happens in the background, so there's no need to wait. These are also ordered, so if you do multiple async_sends, the receiver will receive them in order. If you do a normal send, it has to wait until all the async_sends are delivered before it sends and receives the response, so the order between async_send and send is also maintained.

async_send can block if the protocol buffers become full. But otherwise the call shouldn't do much more than a simple memcpy.

You won't get a response from an async_send, but if you need one, you can either do another async_send from the receiver side, or let the sender do a normal send a bit later to receive the response.

These are mostly useful for sending constant data to the other part, for example a key logger. But also for split keyboard communication, where we don't want to block the scan loop too much. A normal send over USB HID could take several milliseconds, since we are only allowed to send at regular 1ms intervals, so both the sender and receiver have to send at the right time.
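The ordering guarantee between async_send and send can be sketched with a small FIFO. Everything here is an assumption made for illustration: the queue size, the names, and the wire_log_t stand-in that records transmissions instead of talking to hardware. A real driver would also block rather than return false when the queue is full.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QUEUE_SLOTS 4
#define MSG_SIZE    8

typedef struct {
    uint8_t slots[QUEUE_SLOTS][MSG_SIZE];
    size_t  head;  /* index of the oldest queued message */
    size_t  count; /* number of queued messages */
} async_queue_t;

/* Stand-in for the wire: records the first byte of each transmitted message. */
typedef struct {
    uint8_t first_byte[8];
    size_t  len;
} wire_log_t;

static void wire_transmit(wire_log_t *w, const uint8_t *msg) {
    if (w->len < 8) w->first_byte[w->len++] = msg[0];
}

/* Queue a message for background delivery: little more than a memcpy. */
static bool async_enqueue(async_queue_t *q, const uint8_t *msg, size_t len) {
    if (q->count == QUEUE_SLOTS || len > MSG_SIZE) return false;
    size_t tail = (q->head + q->count) % QUEUE_SLOTS;
    memset(q->slots[tail], 0, MSG_SIZE);
    memcpy(q->slots[tail], msg, len);
    q->count++;
    return true;
}

/* Remove the oldest message, preserving FIFO order. */
static bool async_dequeue(async_queue_t *q, uint8_t *out) {
    if (q->count == 0) return false;
    memcpy(out, q->slots[q->head], MSG_SIZE);
    q->head = (q->head + 1) % QUEUE_SLOTS;
    q->count--;
    return true;
}

/* A blocking send first drains everything queued before it, so the
 * receiver sees async and normal sends in the order they were issued. */
static void sync_send(async_queue_t *q, wire_log_t *w, const uint8_t *msg) {
    uint8_t pending[MSG_SIZE];
    while (async_dequeue(q, pending)) wire_transmit(w, pending);
    wire_transmit(w, msg);
}
```

Queueing messages 1 and 2 and then calling sync_send with message 3 puts 1, 2, 3 on the wire in that order.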

fredizzimo · 2017-01-08T21:02:49Z

I need some advice on the naming, as I don't think api_async_send and api_send are good names.

I described api_async_send above, but basically it just sends something to the other part without caring about the response or waiting for a result.

api_send on the other hand sends a request and receives the result and returns it to the caller before continuing.

So I have been thinking about the following
api_send_and_receive vs api_just_send
or
api_send_and_receive vs api_send

But I'm not entirely happy with the above names, so I'm wondering if you have better suggestions.

On the receiver side, I have
api_handle_and_respond and api_handle respectively, but I'm not sure about those either.

iFreilicht · 2017-06-27T10:17:24Z

@fredizzimo I believe those names should be seen as a whole, they have to make sense together. I do like how descriptive the ones you already have are, but "handle" is a bit too ambiguous, as handling a message can also include responding to it.

Let's think this through; either a message is sent from the sender and received by the receiver, or it is sent from the sender, received and responded to by the receiver and then awaited by the sender. The two functions act as logical pairs.

So a potential set of candidates would be api_send, api_send_and_await_response, api_receive, api_receive_and_respond.

According to this SE Answer, reply might actually be a better word here.
Additionally, if we talk about a receiver, we could call the matching entity "transmitter" as it is done in serial protocols.

So, an alternative set would be api_transmit, api_transmit_and_await_reply, api_receive, api_receive_and_reply. These sound pretty good, I think. Very descriptive, no ambiguity.

mavanmanen · 2017-12-28T09:04:18Z

Any updates on potential implementation of this feature?

algernon · 2017-12-28T09:24:32Z

I haven't read through the whole thread here, so read my words with that in mind. In Kaleidoscope, I created a bidirectional communication plugin, dubbed Focus that works over Serial. It is a text based protocol, with no real spec, that relies more on convention. It is strictly serial, as in, you can't have overlapping commands on the wire.

The implementation of this is pretty simple. Not the most efficient by far, and there's a lot of data travelling over the wire, so it's not going to be fast. But the assumption is that this doesn't matter, because all of these happen rarely enough to not be a problem.

The README (in the repo linked above) has a few examples of the protocol itself.

VanLaser · 2018-09-14T18:18:53Z

My implementation of using raw HID (for setting keymaps, backlight settings in EEPROM) is here:

https://github.com/Wilba6582/qmk_firmware/tree/zeal60

I've done what I wrote about in the previous comment. The first byte in the packet is the "function" identifier, and the rest of the data are the arguments to that "function".

@Wilba6582 - first, great work in implementing something practical/concrete! 👍 second: could you point me to the host-related API (or commands) that send data to the keyboard? Are you using something custom based on PJRC - USB: Raw HID? And third: do I need your branch, or could I use the official QMK firmware code? (I see 'raw_hid_send' there)

wilba · 2018-09-15T01:02:22Z

@VanLaser this is now merged into QMK master, in keyboards/zeal60. The host code that works with it is here: https://github.com/Wilba6582/zeal60

I haven't really documented the protocol, but it's basically using the first byte as a command ID that is handled by the "keyboard", thus it matches a per-keyboard set of command IDs. This lets the keyboard code route the command to a module. The next byte can then be a module-specific command ID, and subsequent bytes can be arguments to that command. To keep things simple and use less code, I also have commands that set/get parameters using a parameter ID. This was previously done via passing the raw bytes of a struct, i.e. both sides would compile the same structs (in C/C++), but this requires more definitions to be shared between host and firmware and complicates usage in JS. Right now, the whole API definition (i.e. its magic numbers, etc.) can be defined just with enums for command IDs, parameter IDs, parameter values, etc.
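To make the routing concrete, here is a minimal C sketch of that byte layout. The enum values, the backlight parameter table and the in-place reply are assumptions for illustration and do not reproduce the actual Zeal60 command set.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IDs; a real keyboard would define its own enums. */
enum command_id      { CMD_GET_PROTOCOL_VERSION = 0x01, CMD_BACKLIGHT = 0x02 };
enum backlight_cmd   { BL_SET_PARAM = 0x01, BL_GET_PARAM = 0x02 };
enum backlight_param { BL_PARAM_BRIGHTNESS = 0x00, BL_PARAM_EFFECT = 0x01, BL_PARAM_COUNT };

#define PROTOCOL_VERSION 1

static uint8_t backlight_params[BL_PARAM_COUNT];

/* Module level: byte 0 is the module command, byte 1 the parameter ID,
 * byte 2 the value (written on set, filled in on get). */
static void backlight_command(uint8_t *data, size_t length) {
    if (length < 3 || data[1] >= BL_PARAM_COUNT) return;
    switch (data[0]) {
        case BL_SET_PARAM: backlight_params[data[1]] = data[2]; break;
        case BL_GET_PARAM: data[2] = backlight_params[data[1]]; break;
        default: break;
    }
}

/* Keyboard level: byte 0 selects the module. The packet is modified in
 * place so that the same buffer can be sent back as the reply
 * (an assumption of this sketch). */
static void handle_packet(uint8_t *data, size_t length) {
    if (length < 1) return;
    switch (data[0]) {
        case CMD_GET_PROTOCOL_VERSION:
            if (length >= 3) {
                data[1] = (uint8_t)(PROTOCOL_VERSION >> 8); /* big-endian */
                data[2] = (uint8_t)(PROTOCOL_VERSION & 0xFF);
            }
            break;
        case CMD_BACKLIGHT:
            backlight_command(data + 1, length - 1);
            break;
        default:
            break;
    }
}
```

Since the whole interface is a set of small integers, a host written in another language only needs the same enum values, not shared struct definitions.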

VanLaser · 2018-09-15T19:55:22Z

@Wilba6582 Thanks for the link and the info, much appreciated!

VanLaser · 2018-09-18T20:11:02Z

BTW one cool idea IMO of host<->kbd communication would be for a PC application to ask the keyboard for the key mappings in layer N or other related info, so it could for example show a semi-transparent, temporary, on screen display at the user's request with: the key mappings, or what the current layer is (e.g. when one toggled to layer N), or if a leader key sequence is in progress, macro info etc.

I.e. ask the keyboard how it was programmed and display that info, since the user could have forgotten it.

pvinis · 2018-09-19T02:21:54Z

I've had the exact same idea for a while. I was thinking of even layers, so when I hold a layer changing key, the keymap shown reflects that.
I actually started with a small app and I was trying to get all that info from the actual keymap, which is the source, but it's not so easy.

drashna · 2018-10-21T16:28:22Z

Wilba added RAW HID support when merging in the Zeal boards. IIRC.

We could expand upon that and add documentation for this.

robbiet480 · 2020-08-06T22:48:52Z

Has anyone made progress on this? I am in the midst of implementing my own protocol via HID to talk with a companion app for macOS but obviously would prefer to not re-invent the wheel.

tzarc · 2020-08-07T00:45:45Z

The via protocol is already present and usable... they've currently got a closed-source app, but given that the API is available over RAW HID, anybody can create an implementation that leverages the same API.

iFreilicht · 2020-08-16T12:35:09Z

@robbiet480 There is a virtual serial port available. Unfortunately, it is not documented, but it's very easy to enable, see #9131.

robbiet480 · 2020-08-18T22:22:14Z

@iFreilicht thanks for letting me know, but I'm probably gonna stick with HID since it's already implemented. I may need to consider users with VIA enabled though, since they already use HID.

wilba · 2020-08-19T01:46:46Z

@robbiet480 VIA will only attempt to communicate with devices that have known VID/PID so if you use one that it does not know about then it will ignore any custom usage of raw HID. If you want to do both (VIA and your own custom communication), then that is also possible - just handle your own commands.

github-actions · 2022-06-16T21:16:40Z

This issue has been automatically marked as stale because it has not had activity in the last 90 days. It will be closed in the next 30 days unless it is tagged properly or other activity occurs.
For maintainers: Please label with bug, in progress, on hold, discussion or to do to prevent the issue from being re-flagged.

PeterHindes · 2023-01-09T08:51:24Z

This is certainly an interesting proposal. I think JSON over serial might be a more flexible communication protocol between QMK boards and a host application. On top of that, a standard could be defined for the JSON messages which multiple vendors' keyboards could implement if wanted, or one-off host applications plus custom QMK functions could be defined. I will have to look into how VIA support is currently implemented, but I think this would likely be a separate protocol, to keep support for VIA on boards that use custom callbacks for custom host applications in keymap.c.

fredizzimo mentioned this issue Dec 3, 2016

Placing #ifdefs inside or outside of source files #452

Closed

skullydazed mentioned this issue Dec 4, 2016

RFI(nput): RGB Backlight Support #934

Closed

fredizzimo mentioned this issue Jan 5, 2017

Initial whitefox support #987

Merged

fredizzimo mentioned this issue Mar 26, 2017

Effect System Proposal #717

Closed

fredizzimo mentioned this issue May 1, 2017

visualizer and keycodes to operating system #1270

Closed

skullydazed mentioned this issue Jun 27, 2017

HID Send? #747

Closed

drashna added enhancement discussion labels Oct 21, 2018

h-youhei mentioned this issue Mar 5, 2019

kana layer h-youhei/qmk_keymap#3

Closed

github-actions bot added the stale label Jun 16, 2022

tzarc closed this as not planned Jun 16, 2022
