Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #856 - Add debug task #859

Closed
wants to merge 2 commits into from
Closed

Conversation

fjtirado
Copy link
Collaborator

@fjtirado fjtirado commented May 27, 2024

Many thanks for submitting your Pull Request ❤️!

Please specify parts of this PR update:

  • Specification
  • Schema
  • Examples
  • Extensions
  • Roadmap
  • Use Cases
  • Community
  • TCK
  • Other

Discussion or Issue link:

Fixes #856

What this PR does / why we need it:

Add debug task

Special notes for reviewers:

Additional information:

@ricardozanini ricardozanini changed the title [Fix_#856] Add debug task Fix #856 - Add debug task May 27, 2024
@ricardozanini ricardozanini added change: feature New feature or request. Impacts in a minor version change area: spec Changes in the Specification labels May 27, 2024
ctk/features/debug.feature Outdated Show resolved Hide resolved
@@ -28,6 +28,7 @@
- [Set](#set)
- [Try](#try)
- [Wait](#wait)
- [Debug](#debug)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm honestly wondering if this is a task or a call.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically you will call debug within a composite task before and after invoking another task. so I think is easier if debug is a task itself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would probably be even better to call it via an extension.

Copy link
Member

@cdavernas cdavernas May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with Jb, imho it should not be a task nor a call.

Reasons are:

  1. It's irrelevant to authors of the workflow (but might be to implementers): there is no effect on the workflow
  2. There is no way to unify behavior accross runtimes (logs will have different format and features. Runtimes might enforce structured logs, too)
  3. You can achieve the same using shell+echo or, even better, outputting logs to a specific file
  4. Runtime specific: aside from 2, runtimes (ex: Synapse) might not want to offer the user the ability to write to logs. If It's up to the runtimes to get logs from Somewhere and do smtg with it, it does not belong to the DSL imo.

Copy link
Member

@cdavernas cdavernas May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a side note, the use case you described to @ricardozanini is covered by the sample logging extension (see dsl-reference.md)

Copy link
Member

@cdavernas cdavernas May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bottom line is that I would instead recommend that "log" task to become the first one in the functions gallery. So instead of "forcing" runtimes to implement it, we create a 'run' task which uses shell to echo or output a potentially formatted message.
Wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JBBianchi I want to avoid different runtimes implementing logging in different ways.
In your extension example, you are invoking an URI to collect logging, but this task is intended to be able to call the runtime library (a logging library in Java, fmt package in go, Ilogger in C#) that actually logs in the console without losing portability (if any runtime implemenation support logging, the definition file is fully portable)

Copy link
Collaborator Author

@fjtirado fjtirado May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cdavernas
Sample logging extension is an example of how to collect logs, it does not cover the case where you want to log a message to the standard console. I think using extension mechanism is superb if we have the log task, but without the log task, how do users logs to the console?
I think forcing people to write logs though a call to a shell (which actually assume the workflow is running with access to a shell, which is not true if we are using windows SO) is worse that forcing runtime to implement a log task (besides the performance difference of calling to the shell to write a message compared with the performance of a system call to print in the standard output)
As a side note, as I mentioned in the issue, this is not my personal preference, I actually have real feedback from users ot the current spec. They are using custom sysout function (sonataflow) a lot. This makes workflows not portable. Im trying to avoid losing portability and losing functionality (I cannot tell that users that they cannot log to the console any longer)

Copy link
Collaborator Author

@fjtirado fjtirado May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2. There is no way to unify behavior accross runtimes (logs will have different format and features. Runtimes might enforce structured logs, too)

Any programming language which is used to implement the spec will have means to write to standar output (if there is not standard output, nothing is written), so implementation is trivial,
The message is just the string, the format is up to the implementation (which is fine as far as the message is part of the log). Implementations might provide extra configuration to format the log message (up to the implementation) The point is that the workflow definition will still be portable and when executed in any implementation it will be provide some kind of feedback though the standard console (if available). Thats all.

Signed-off-by: Francisco Javier Tirado Sarti <[email protected]>
Signed-off-by: Francisco Javier Tirado Sarti <[email protected]>
@JBBianchi
Copy link
Member

Just my 2 cents but I'd rather go for log rather than debug. In my mind, debugging is the whole process the developper is tackling when trying to find and solve a bug. To log is to write down an entry in a journal.

@fjtirado
Copy link
Collaborator Author

Just my 2 cents but I'd rather go for log rather than debug. In my mind, debugging is the whole process the developper is tackling when trying to find and solve a bug. To log is to write down an entry in a journal.

Yes, I think log is better than debug.

@cdavernas
Copy link
Member

cdavernas commented May 28, 2024

As said above, and mainly because the same can be achieved using what's already here, I recommend we instead create a new ServerlessWorkflow repository, such as serverlessworkflow/functions, or just a directory in the present repository where we can store references to "custom"/"additional" functions.

The feature requested by @fjtirado perfectly demonstrate the definition of an opiniated function used to perform (structured?) logging, using stdout and/or plain txt(and event JSON?) files.

This would be a great way to demonstrate the extensivity of the DSL, while allowing us to set the bases for the "collection/gallery" we discussed about.

Such function could look like this:

#function.yaml
log:
  describe:
    summary: Logs the specified message
  input:
    schema:
      type: object
      properties:
        mode:
          type: string
          enum: [ raw, structured ]
        output:
          type: string
          enum: [ console, stdout, stderr, file ]
        level:
          type: string
          enum: [ trace, debug, information, warning, error  ]
        message:
          type: string
        arguments:
          type: object
          additionalProperties: true
  output:
    schema:
      type: string
  execute:
    sequentially:
      switchOutputMode:
        switch:
          console:
            when: .output == "console"
            then: writeToConsole
          file:
            when: .output == "file"
            then: writeToFile
          ...
       writeToFile:
         run:
           process:
             command: ${ "echo \"\(.message)\"" }
      ...

And to use it:

do:
  log:
    call: log #note that because we dont set an uri, it's assumed to be a core function

@fjtirado
Copy link
Collaborator Author

fjtirado commented May 28, 2024

tion of an opiniated function used to perform (structured?) logging, using stdout and/or plain txt(and event JSON?)

I do not think log is more "opiniated" than calling a shell from the workflow. And it is not true that we can write to the standart output, in a not confusing way, with the current constructs
How do writeToStdout will look like? I do not see a way to define it the yaml

@cdavernas
Copy link
Member

cdavernas commented May 28, 2024

I do not think log is more "opiniated" than calling a shell from the workflow.

The way you are gonna implement the "log" function will be optionated, because it is much more complex and variable than, as you said, calling a shell: shell is the execution of a line of text, the command, and that works that way on all systems, unrestrictively, that way.

Log, like I said, can be plain, or structured (therefore requiring variables), have log levels (that differ from systems to systems), can be written to several outputs, might require to be formatted in a way or another, will be processed in a way or another, might or might not need to be send to an aggregator, etc.

@cdavernas
Copy link
Member

cdavernas commented May 28, 2024

How do writeToStdout will look like? I do not see a way to define it the yaml

That's a very fine question! I'll need time to think about it. On your end, please consider my suggestion. We might yet again find the solution in between 😄

Writting to STDOUT is an edge use case IMO and, while I believe it might be usefull to me as an implementer/tester, I think we can agree that it will probably never be used in 99% of (business) use cases, and will be "opiniated" in some way, therefore restrictive on another.

@fjtirado
Copy link
Collaborator Author

I do not think log is more "opiniated" than calling a shell from the workflow.

The way you are gonna implement the "log" function will be optionated, because it is much more complex and variable than, as you said, calling a shell: shell is the execution of a line of text, the command, and that works that way on all systems, unrestrictively, that way.

Log, like I said, can be plain, or structured (therefore requiring variables), have log levels (that differ from systems to systems), can be written to several outputs, might require to be formatted in a way or another, will be processed in a way or another, might or might not need to be send to an aggregator, etc.

Yes, thats true, but I think the essential part of the log is the message. Where that log is written and if the message is raw or structure (or at what level should be written) should be left to implementor choice, and should not be part of the workflow defintion. What the writer of the flow is trying to achieve is to have some feedback of whats going on inside the flow (it is tipically a dev feature)
About the command, I did not want to discuss because is already there, but this clearly makes the definition file not portable (if the SO does not support the command)

@fjtirado
Copy link
Collaborator Author

time to think about it. On your end, please consider my suggestion. We might yet again find the solution in between 😄

For me the solution is to include the log idiom in the definition language so implementors actually implement it

@cdavernas
Copy link
Member

Where that log is written and if the message is raw or structure (or at what level should be written) should be left to implementor choice

Unhappilly, that does not always apply. In structured mode, you cannot write the interpolated message to STDOUT, you need to have, on one hand, the non-interpolated message (ex: "Hello, {preferredUserName}"), and in another hand, a dictionary of named arguments (ex: { "preferredUserName": "Ricardo Zanini" })

@fjtirado
Copy link
Collaborator Author

@cdavernas What I was proposing is a loose way for users to actually print message to the underlying logging mechanism. Which is the underlying logging mechanism users do not really care, they just want to see the message. If they do not configure the underlying logging mechanism, the message will be tipically printed as raw message to the standart output (which is tipically the console)
The underlying logging mechanism is up to the implementation and I do not see it realistic to try to cover all variant in a language indepedepent spec (I think we agree on that), but I think the spec should have a mechanism to log raw strings (and the implementors provide addifitional configuration) The assumption here is that how the message looks is not part of the spec, it is up to the implementor (as far as the message is there somehow, they are fine)
Printing to stdout is not an edge case, it is a dev case (users want to see the progress of the workflow in the console when exeucting it the first time during testing)

@cdavernas
Copy link
Member

About the command, I did not want to discuss because is already there, but this clearly makes the definition file not portable (if the SO does not support the command)

Yes, you are totally right! That is part of a conversation we had with @ricardozanini, where he was stating the need to restrict the environment workflows/tasks are meant to run/be isolated on.

Since we boast about the fact that the DSL is free, and vendor-neutral, I suggest we stick to the leanest Linux OS(Alpine?). Anyways, as @ricardozanini mentionned in same discussion, we should probably also document that shell command should be isolated in either an environment per workflow instance, or why not an environment per command.

@fjtirado
Copy link
Collaborator Author

About the command, I did not want to discuss because is already there, but this clearly makes the definition file not portable (if the SO does not support the command)

Yes, you are totally right! That is part of a conversation we had with @ricardozanini, where he was stating the need to restrict the environment workflows/tasks are meant to run/be isolated on.

Since we boast about the fact that the DSL is free, and vendor-neutral, I suggest we stick to the leanest Linux OS(Alpine?). Anyways, as @ricardozanini mentionned in same discussion, we should probably also document that shell command should be isolated in either an environment per workflow instance, or why not an environment per command.

Although not related to this issue, I would not have allowed the shell command in the spec. The same we do not have support to access file system or to access a DB (becuase it makes the spec not portable) The reason Im pushing for log is just pragmatical, users of Sonataflow are using sysout a lot and assuming the existance of standard output (or underlying logging framework) is fair (as far as we leave it generic, just the raw messagem thats why I intentionally skipped level, raw/structure and destination of the log, leaving that to the implementor configuration).

Assuming the existance of a shell if fair too, but the problem is that commands are not portable, while raw strings are.

@cdavernas
Copy link
Member

Printing to stdout is not an edge case, it is a dev case

Exactly what I am saying: Dev, as in runtime implementer. Any non-stdout alternatives would be enough for any other user. And as a dev, I will need to export my logs, formatted as explained above, so the raw message is a little more than useless in my personnal use case. Actually, like I said before, in the Synapse use case, I explicitly want to forbid users (even myself or other devs) to write to the STDOUT.

In my opinion, another mechanism that clearly separate the concerns would be more appropriate, even if its cuts off STDOUT (anyways, if your runtime wants to do so, it can). As a Synapse implementer though, I would rather not have to implement that ability for something like 0.5-1% of our user base, and provide them with a golden option to malpractice.

As a matter of fact, I am sure you would never misuse it, let alone in a debug context, but you have developped "against" users probably even far longer than me, and you know what will happen if we let that door open for one the "basic" functionality of a workflow.

Finally, I do not believe "logging" should be a core functionality. "Calls" are intented to interact with services, described by standards (all of them). While some are missing, like you mentionned (i.e.: GraphQL), I do not see the need to add a non-standard, non-communicative, non-active, (imo) opiniated one.

@fjtirado
Copy link
Collaborator Author

fjtirado commented May 28, 2024

Anyway, since we do not have universal consensus on the need for a log idiom in the spec, the alternative is to log extensively in the implementation.
For example, if implementation is done in java (sl4j), when debug mode is enable, users can see all modification in the model and all service responses.
Because I guess the reason because Sonataflow users use sysout custom function a lot might be a consequence of scarce logging on the implementation side. So, Im perfectly fine if we all agree such idiom is not needed and it is implementation task to ensure developers can follow the execution flow through intensive logging. However, Im afraid most implementation will introduce non standard extension to provide users with such facility (which as I said is "portable" because all SO support standard console an raw string). I think this issue deserve a patient vote ;)

@cdavernas
Copy link
Member

cdavernas commented May 28, 2024

Although not related to this issue

Totally agreed! Happy to speak about with you on a discussion. Might be usefull to come up with an ADR about it, too.

the problem is that commands are not portable

Commands are portable if you define a reference OS.

while raw strings are.

Yes, but allowing just raw strings to be outputted to just STDOUT is fitting your use case, but not mine, for example.

As I said, I agree that a log functionality could be needed. I do not agree that is should be restricted to a basic interpolated message, and that its output should be restricted to stdout. I do not think that writting to STDOUT should be possible.

@cdavernas
Copy link
Member

Anyway, since we do not have universal consensus on the need for a log idiom in the spec, the alternative is to log extensively in the implementation.

Yes! We could even extensively describe/define the logging context expected for runtimes, for example, as you said, in debug mode.

I also still believe the log function should be added to the DSL core "extra" functions, but then maybe without the STDOUT approach.

@JBBianchi @ricardozanini What do you guys think?

@fjtirado
Copy link
Collaborator Author

fjtirado commented May 28, 2024

@cdavernas logging function based on command call is lame in my opinion. so I prefer to not include any log at all (and users collect log through http call as in current extension example)
and I think we should revisit the command idiom, since it is not portable and should not be there. Instead of it, we should revisit the use cases and try to model them in a different way, no open a pandora box where uses can basically execute anything supported by the underlying SO (which I guess is a security issue

@cdavernas
Copy link
Member

so I prefer to not include any log at all (and users collect log through http call as in current extension example)

As you wish. And while it's true that echo logging is lame, I believe that file-based one is not, in a dev context, which is achieved by using a shell command.

I think we should revisit the command idiom, since it is not portable and should not be there

Again, it is portable if the OS is restricted (to a free one). But you are right, though, maybe we should get rid of the shell process.

@fjtirado
Copy link
Collaborator Author

@cdavernas About file based one, how it will look?

@cdavernas cdavernas mentioned this pull request May 28, 2024
@cdavernas
Copy link
Member

cdavernas commented May 28, 2024

I think we should revisit the command idiom, since it is not portable and should not be there

I opened #860 to describe your proposal.

About file based one, how it will look?

writeToFile:
  run:
    process:
      command: ${ "echo \"\(.message)\"" >> logs.txt }

@ricardozanini
Copy link
Member

Depends on a resolution of #860

@fjtirado fjtirado closed this Jun 13, 2024
@cdavernas cdavernas mentioned this pull request Jun 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: spec Changes in the Specification change: feature New feature or request. Impacts in a minor version change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Print Task
4 participants