Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to check if editor has no content/is empty? #401

Closed
oskarwrobel opened this issue Feb 14, 2017 · 45 comments · Fixed by ckeditor/ckeditor5-engine#1656
Closed

How to check if editor has no content/is empty? #401

oskarwrobel opened this issue Feb 14, 2017 · 45 comments · Fixed by ckeditor/ckeditor5-engine#1656
Assignees
Labels
type:feature This issue reports a feature request (an idea for a new functionality or a missing option).
Milestone

Comments

@oskarwrobel
Copy link
Contributor

oskarwrobel commented Feb 14, 2017

In most of cases editor will have an empty paragraph on init. So getData() == '' will be false almost all the time. What if someone adds some spaces or new lines?
For example, you use CKE as an input to write posts or comments and you want to prevent user from submitting form until content.trim() == ''. How to do it?

My first thought was to add an option getData( { trim: true } ) or create method isEmpty( document ) but @pjasiun and @scofalik pointed out that it won't be so easy because there is no common rule for every feature that defines when feature is empty. Eg. Can table be empty? or image?

@oskarwrobel oskarwrobel added the type:question This issue asks a question (how to...). label Feb 14, 2017
@fredck
Copy link
Contributor

fredck commented Feb 15, 2017

This is an important feature. In v4 we simply discard the data on output if it looks like empty.

@mlewand, can you point us to this piece of code in v4?

@mlewand
Copy link
Contributor

mlewand commented Feb 15, 2017

There's this config option called ignoreEmptyParagraph.

When calling editor's getData() it will perform a regexp replace of empty paragraph (or some other elements based on regexp).

As for content that's true that some elements might be recognized as part of empty content, e.g. empty paragraph, div, section etc. However IMHO having an empty table should not mark editor as empty.

@fredck
Copy link
Contributor

fredck commented Feb 15, 2017

Thanks, @mlewand!

I propose to KISS on this and have an approach similar to v4 at data output processing. I would not make it configurable for now.

@Reinmar
Copy link
Member

Reinmar commented Feb 15, 2017

I really don't like that regexp solution (and most regexp solutions :P). Besides, it would need to be implemented in HtmlDP, because only it works with HTML strings. So, we may need a similar solution for the MarkdownDP later on (e.g. to not print empty list items) and for any other DP.

So, first we need to answer whether this is HTML related feature and can be implemented in HtmlDP because it doesn't concern other formats. Or that it's a generic feature and is required for all formats.

The other question is whether we may need to make it in any way adjustable in the future. We could, perhaps, base this on simple schema checks (checking whether elements are blocks and if so ignoring them if empty). If someone would need to filter out more empty content, then they could do this based on their own checks (you can always have your own isEmpty( doc ) function). However, if we hardcode any solution in getData(), we risk that someone will not want to have some data filtered out. In such case, we'll have a problem.

The last argument means that having a separate isEmpty() method would be better. getData() should have a single responsibility – returning data. Checking whether the data is empty should be another method's responsibility.

@scofalik
Copy link
Contributor

My 2c: right now we could do this using view, but it won't be 100% accurate (I mean, it depends on what we will want to achieve).

In view, we have several different element types, but only two of them are "real" data: Text node and EmptyElement node (used by image). Then we have "containers": ContainerElement (and all child classes) and AttributeElement. Then we have UIElement for UI stuff.

Other idea proposed by @pjasiun was to enhance schema with possibility to mark which elements are data and which are storing data. In other words, which elements should be treated as empty if they have no content.

For example, image is always a content, no matter if it is empty or not. Paragraph is empty if it does not have content. But table is more complicated, maybe it should not be treated as empty even if it is.

@fredck
Copy link
Contributor

fredck commented Feb 15, 2017

We have two options:

  1. Make this in 3 lines of code and little talk directly in the HTML DP. Other DPs would have to handle this by themselves.
  2. Make this a long discussion, dozens of lines of code, several days of development and review. It'll be perfect, for the sake of being perfect.

I'm definitely for option 1, which can at any moment by replaced by 2, if 1 will be revealed to be inefficient.

That's my point on KISS.

@mlewand
Copy link
Contributor

mlewand commented Feb 15, 2017

The last argument means that having a separate isEmpty() method would be better. getData() should have a single responsibility – returning data. Checking whether the data is empty should be another method's responsibility.

+1 it makes a lot of sense

For example, image is always a content, no matter if it is empty or not. Paragraph is empty if it does not have content. But table is more complicated, maybe it should not be treated as empty even if it is.

Keep in mind that you also want to check for number of such "empty elements" as 2 paragraphs should not be identified as empty content.

@fredck
Copy link
Contributor

fredck commented Feb 15, 2017

The last argument means that having a separate isEmpty() method would be better. getData() should have a single responsibility – returning data. Checking whether the data is empty should be another method's responsibility.
+1 it makes a lot of sense

I would look what this issue reporter expected. This may drive the expectation of other developers. If they look to an empty editor, they expect that editor to be empty. By not doing this with getData() (which internally may call isEmpty()), we add yet another level of complexity to implementers trying to use our already complex API.

@fredck
Copy link
Contributor

fredck commented Feb 15, 2017

Also, keep in mind that such checks are many times done at the server side, once data is posted.

@Reinmar
Copy link
Member

Reinmar commented Feb 15, 2017

I'm definitely for option 1, which can at any moment by replaced by 2, if 1 will be revealed to be inefficient.

We won't be able to change this once developers rely on it. And if we plan to make such a non-BC changes it, then there's no point in making them in the first place. We were meant to have a quite stable API after 1.0, weren't we?

we add yet another level of complexity to implementers trying to use our already complex API.

An isEmpty() method is what you actually look for if you want to check if editor is empty. I remember couple of questions on SO regarding this – turns out using getData() isn't that intuitive. See e.g. http://stackoverflow.com/questions/3277250/how-to-check-whether-ckeditor-has-some-text-in-it

It's actually quite depressing because they proposed getting a raw text from the editor and it's the most popular answer there ;|. The right answer is 3x less popular.

An SRP rule tells us that a single method should do one thing. This is KISS. Moreover, it lets you compose the methods differently. If the default isEmpty() implementation doesn't satisfy you, you use another one, but getData() is still ok for you.

E.g., I'd KISS regarding isEmpty(). There's no need to make it configurable through the schema. Since we made it a separate method, if you require different rules, you implement your own method. But you don't need to reimplement the whole getData().

@fredck
Copy link
Contributor

fredck commented Feb 16, 2017

Then go around explaining that isEmpty() == true and getData() is not empty :/

@Reinmar
Copy link
Member

Reinmar commented Feb 16, 2017

As we can see, you need to explain how to check if the editor is empty in CKE4's case too. I, personally, believe that a more explicit API is more verbose. As a result, the system is less magical and less confusing.

I guess we can continue this discussion, but I'd like to hear an opinion of people less accustomed to CKE4. After all, this is not for us.

@Reinmar
Copy link
Member

Reinmar commented Feb 16, 2017

Definition – we call an editor empty if it contains an empty paragraph, list item, heading or another kind of flow element. No text, no images, not sure about tables or other rich content like e.g. simplebox widget.

Pick one:

  1. When the editor is empty getData() should return an empty string.
    • Rationale – it allows checking if the editor contains no content just by checking if returned data is an empty string.
  2. When the editor is empty getData() should return the data as it is (e.g. <h2></h2>) and developer should be able to check if the editor is empty using a separate isEmpty() method.
    • Rationale – having separate getData() and isEmpty() applies single responsibility principle and lets you e.g. to use a different logic for "is empty".

Note: Consider the fact that definition of the empty editor is not definitive. It's our current assumption and may differ per case.

@scofalik
Copy link
Contributor

scofalik commented Feb 16, 2017

I'd go with second because first option gives us nothing and isEmpty() implemented like this will almost always return false, cause we almost always have <p> element there.

Additionally, I'd consider naming the method hasNoContent() to make it more explicit. Also I think this should be Editor method (duh) and specific implementations should be able to overwrite it (if your editor has some weird condition where it is considered that there is no content in it).

Edit: I mean, above is pretty obvious, but what I want to highlight is a fact that specific implementation may have it's own rules and we should treat "default content" or "no content" as something "configurable" and solution-specific.

@fredck
Copy link
Contributor

fredck commented Feb 16, 2017

I'm for option 1 (surprise!) as it is closer to the way humans think and also to enable out of the box server side empty check.

getData() should use isEmpty(). If one would ever dislike the way getData() behaves, they can override isEmpty() to match their needs.

@mlewand
Copy link
Contributor

mlewand commented Feb 16, 2017

I'm for option number 2. - isEmpty() method (or getter). We should avoid doing magic with getData() and make it as simple and predictable as possible. isEmpty would have it's clear purpose, and would be the only way to check it.

Also it's going to be much easier to find in docs, and for that reason I'm for staying with isEmpty because IMHO the very first thing you'd be looking for in docs is "empty" phrase.

@fredck
Copy link
Contributor

fredck commented Feb 16, 2017

I repeat myself here, to underline this problem... please do not ignore the server side empty check problem.

@fredck
Copy link
Contributor

fredck commented Feb 16, 2017

Also, it would be great to see reports of issues like "getData() is empty when the editor is not empty" in CKEditor 4. @mlewand, do you have anything?

@Reinmar
Copy link
Member

Reinmar commented Feb 16, 2017

I repeat myself here, to underline this problem... please do not ignore the server side empty check problem.

Not a problem. This is an integration aspect. The data must not be sent to the server (or empty string/whatever must be sent) if the editor is empty. It's JS sending the data most of the time. And even if we consider textarea replacement case, the "is empty" check can be considered in this case. But then it's only an optional end-feature's trait. Not a hardcoded logic which affects every possible use case.

@oskarwrobel
Copy link
Contributor Author

oskarwrobel commented Feb 16, 2017

It's still a pity that I can't easily trim data returned by the editor. I want to get data from the form element and send to the server without empty paragraphs or/and white spaces at the end of content. I don't have too much experience with using RTEs I'm more familiar with pure form elements and for me it is natural to trim the content.

@oskarwrobel
Copy link
Contributor Author

oskarwrobel commented Feb 16, 2017

If we have a helper to "trim" the data it will solve the problem with checking if data is empty.

const isEmpty = trimmedData == '';

exactly the same as checking if form element value is empty.

@Reinmar
Copy link
Member

Reinmar commented Feb 16, 2017

It's still a pity that I can't easily trim data returned by the editor. I want to get data from the form element and send to the server without empty paragraphs or/and white characters at the end of content. I don't have too much experience with using RTE's I'm more familiar with pure form elements and for me it is natural to trim the content.

This is a very interesting point, in fact. You're right that a normal getData() call, even if it would ignore empty blocks, it'd still not be enough. I agree that either a param or a helper would be nice. A param would makes sense, IMO.

@Reinmar
Copy link
Member

Reinmar commented Feb 16, 2017

I think that @oskarwrobel's point answers the question and our entire discussion went in the wrong direction :D. Trimming what would be trailing whitespaces and new lines in plain text or Markdown is indeed necessary. What's important – it can't be implemented in a reasonable way as a separate step to be executed after getData(), so it must be a part of getData(). But since the option is even more intrusive than what CKE4 implements, it's even more apparent that it must be optional.

  1. Question – should it be on or off by default? I can agree with both, but I'm leaning towards off because it means that the option will be { trim: true }, not { dontTrim: true } :D. What may be even more important is that having it off by default means that whoever will use it, will do that consciously. This means that some potential incorrect or surprising behaviour will be easier to understand.
  2. This would be a feature working on the model or on the view (we need to decide which layer gives better information). Besides being used by the data pipeline, it can also be exposed as a separate tool (or DataController's method).
  3. We can add a separate isEmpty() method which checks getData( { trim: true } ) === '' to address the issue with potential bad discoverability of this feature.

WDYT?

@pjasiun
Copy link

pjasiun commented Feb 20, 2017

Question – should it be on or off by default? I can agree with both, but I'm leaning towards off because it means that the option will be { trim: true }, not { dontTrim: true } :D. What may be even more important is that having it off by default means that whoever will use it, will do that consciously. This means that some potential incorrect or surprising behaviour will be easier to understand.

Both fine for me.

This would be a feature working on the model or on the view (we need to decide which layer gives better information). Besides being used by the data pipeline, it can also be exposed as a separate tool (or DataController's method).

Hard to say. On one hand, it would be more correct on the model, since it's a model what have a semantic information in the scheme. On the other, the view we can modify in the getData processing because it is created only temporary. To do the same on the model, we would have to clone it. On the other hand, if all we want to do is to trim the end of the content, we can limit the range of the model data to convert.

We can add a separate isEmpty() method which checks getData( { trim: true } ) === '' to address the issue with potential bad discoverability of this feature.

For sure there should be a separate isEmpty() methods.

@pjasiun
Copy link

pjasiun commented Feb 21, 2017

After reviewing some code which do:

editor.document.on( 'change', () => {
     const data = editor.getData();
     
     saveButtonView.isEnabled = isNotEmpty( data );
} );

I realized that the isEmpty should not be based on the getData. Users want to check if the content is empty with every change, it should be observable. And getData is a pretty heavy operation, comparing it to checking if the content is empty. In my opinion, we should have isEmpty method or observable empty property on editor.document, which can be easily bound to the submit button.

Note that checking if the document is empty can be done very quickly, without involving the whole data processing.

@Reinmar
Copy link
Member

Reinmar commented Feb 21, 2017

This is a separate requirement and I don't think that it's any important now. You can easily throttle the change event to not cause any serious performance issues.

So, to conclude, we need this now:

  • Editor#getData( { dontTrim: true } ) which also means adding some lower level utils to perform that trimming (most likely the utils should work on the model). I'm not sure how the exact implementation could look like, though.
  • Editor#isEmpty() for the general use. The method will be a shortcut to Editor#getData( { trim: true } ) === ''. If anyone wants a different semantics (e.g. no trimming), then they can achieve it. Let's make sure to mention that in isEmpty's documentation.

In the future, we can add:

  • Lightweight check if the document is empty. This, most likely, should use the same mechanisms which will be used by the methods we'll introduce now.
  • More tools if needed. E.g. character counter (strongly related to trimming tools) needed by sooo many people :).

PS. Some ideas regarding implementation. The trimming mechanism could be a function which gets model element (root) and returns a cloned tree with the trimmed content. Such function will be super useful – e.g. the future document#isEmpty() method will be able to use it. It'd actually look like:

isEmpty( rootName ) {
    return isEmpty( trim( this.getRoot( rootName ) ) );
}

It may also be possible for this method to cache the result until any #change. But that's future.

@fredck
Copy link
Contributor

fredck commented Feb 21, 2017

Editor#getData( { trim: true } )

At the current stage, making { trim: true } configurable is overengineering. We don't even know if anyone needs this (and no real world scenario has been exemplified so far) and it can be included at any moment in the future based on demand.

@Reinmar
Copy link
Member

Reinmar commented Feb 21, 2017

At the current stage, making { trim: true } configurable is overengineering. We don't even know if anyone needs this (and no real world scenario has been exemplified so far) and it can be included at any moment in the future based on demand.

But we need to enable it now (as @oskarwrobel requested), so we can easily make it configurable already. This is extremely cheap and will prevent situations when someone wants to configure it, but needs to wait a month until we release a new version. In fact, having many topics on our heads we usually need to delay such little fixes for a longer time which is certainly super frustrating for the community (we're talking here about hardcoding a certain behaviour, without an option to workaround this; I wouldn't oppose in a different case). Hence, since we're already thinking about this, let's not force people to wait.

The only question is whether this should be on or off by default. I'm fine with both, @pjasiun is too, but I can see that the pressure to just trim the data by default is high, so let's enable it. I updated my above proposal to match this change.

@qcn
Copy link

qcn commented May 22, 2017

@Reinmar has this been implemented yet? We have a problem where we want to keep the contents of an empty editor, because it has formatting applied that we want to keep, eg. <p style="text-align: center;"></p>. But getData throws away the empty tag, which is not desirable. I'm pretty sure the proposed separation of getData and isEmpty behaviour would solve our issue.

@Reinmar Reinmar modified the milestones: backlog, next Jun 6, 2018
@Reinmar
Copy link
Member

Reinmar commented Jun 6, 2018

Sorry to hear that. We actually stumbled upon this ourselves that's why this issue was discussed already. But there's always something more important to work on.

I bumped up this ticket to the "next" milestone. It does not mean we'll be able to work on this in the next iteration but it should speed up things anyway.

@ghost
Copy link

ghost commented Jul 8, 2018

It would be useful also a config prop like: ignoreEmptyParagraph.

Using it with React would be amazing! Thanks!

@Reinmar Reinmar modified the milestones: next, iteration 20 Jul 9, 2018
@Reinmar Reinmar changed the title How to check if editor has no content/is empty ? How to check if editor has no content/is empty? Aug 16, 2018
@Reinmar Reinmar modified the milestones: iteration 20, iteration 21 Sep 17, 2018
@ma2ciek
Copy link
Contributor

ma2ciek commented Sep 27, 2018

@jonnitto
Copy link

Is there a date where a bugfix for this gets released? I see this bug is quite critical, as the content repository get "dirty" like that.

@Reinmar
Copy link
Member

Reinmar commented Jan 22, 2019

@f1ames has it as a high-priority ticket in the current iteration together with https://github.com/ckeditor/ckeditor5-utils/issues/269. I can't promise anything, but it seems that we'll finally get it done and released in mid-February.

@piuccio
Copy link

piuccio commented Jan 23, 2019

I just stumbled upon this, until it gets released my dirty hack is to

const isEmpty = (data) => {
  const div = document.createElement('div');
  div.innerHTML = data;
  return !div.textContent.trim();
};

Works for me because I don't want to save anything in database unless there's at least one readable character, I don't care about images or tables or whatnot.

I do hope a proper fix will be implemented in mid Feb, thanks guys

@f1ames
Copy link
Contributor

f1ames commented Jan 28, 2019

I started to wonder how trimming should actually work - what content should be trimmed. There are many cases when content is empty-like or is "visually" empty. Here are some examples:

Empty content

  1. Empty editor <p>&nbsp</p>
  2. Nested empty tags <h2><strong>&nbsp;</strong></h2>.
  3. Empty styled content <h2 style="text-align:right;">&nbsp;</h2>.
  4. Empty lists <ul><li>&nbsp;</li></ul>.
  5. Empty blockquote <blockquote><p>&nbsp;</p></blockquote>.

1 - 3 should be considered empty and trimmed so getData() returns empty string. Also I'm for trimming 4 - 5 too (however as this elements are visually different than empty editor - list bullet and blockquote mark - we may reconsider it).

Multiple empty blocks / empty blocks on content boundaries

Content mentioned in Empty content section is considered empty (and should be trimmed). What about cases when such content is repeated/duplicated:

  1. <p>&nbsp;</p><p>&nbsp;</p><p>&nbsp;</p>
  2. <ul><li>&nbsp;</li><li>&nbsp;</li></ul>

If a single such element is treated as empty, multiple empty elements are still empty (something like multiplying by 0) so should be trimmed IMHO.

or appears on the beginning/end of the content:

  1. <p>&nbsp</p><p>Foo</p>
  2. <p>Foo</p><p>&nbsp</p>
  3. <p>Bar</p><ul><li>&nbsp;</li><li>&nbsp;</li></ul>

It is a little similar to cases with multiple spaces on the beginning/end of the content, which should be trimmed:

E.g. 3 spaces at the beginning of the content may be considered whitespaces to be removed when there are no markers on them but may need to be preserved if they are marked (e.g. commented)

so maybe elements considered empty should be also trimmed from beginning/end of the content. I'm not sure TBH.

Spacing

Should whitespace sequences (marked as \s+ below) be treated the same as &nbsp; filler? For example:

  1. <p>\s+</p>
  2. <h2><strong>\s+</strong></h2>
  3. <ul><li>&nbsp;</li></ul>

such can appear also on the beginning/end of the content:

  1. <p> Foo Bar </p>
  2. <p> Foo <i> Bar </i></p>
  3. <p>A </p><p> Foo <i> Bar </i></p>

Again from

E.g. 3 spaces at the beginning of the content may be considered whitespaces to be removed when there are no markers on them but may need to be preserved if they are marked (e.g. commented).

I assume that in all cases above spaces or entire elements should be trimmed.

Other elements

There are some elements which should not be trimmed even if they are empty and only elements in the editor content such as:

  • img
  • table
  • figure (img and table are placed inside figure by default)

I just wanted to point out that the origin of this issue is <p>&nbsp;</p> so handling only this case would probably satisfy majority of the cases. But since we went into direction of trimming content it will be good to consider other cases as well. We may KISS (by trimming <p>&nbsp;</p> only) or try to handle it in more general way. RFC (especially on the specific cases) @Reinmar @pjasiun @oskarwrobel

@Reinmar
Copy link
Member

Reinmar commented Jan 28, 2019

I'd define a non-empty content differently – it's content which has:

  • any EmptyElement
  • any Text which matches /\S/ (has at least one non-white-space character)
  • any non-plain ContainerElement (e.g. a widget) - no idea how to check for those, though :|
  • any non-plain AttributeElement (e.g. a comment) – to be verified by @oskarwrobel or @pjasiun how comments are actually downcasted to data – again, no idea how to check for those

And that's it. Anything else is noise – any attribute/contrainer element with no content or not being a content itself can be trimmed.

So, the only problem is how to define "non-plain container/attribute elements". This may require setting some property on them.

@4DallasC
Copy link

If you have jQuery you can use:-

var message = window.editor.getData(), textOnly = $(message).text().trim();
if(textOnly != ''){
    //save your data
}
else {
   //show error message
}

@Reinmar
Copy link
Member

Reinmar commented Feb 12, 2019

When discussing how to implement this functionality we started having doubts regarding trimming whitespaces. Returning empty string when the editor has no visible content (e.g. a single empty paragraph) is one thing and it was implemented in ckeditor/ckeditor5-engine#1656, but returning content with whitespaces trimmed at the start and end is more tricky.

It's tricky because we need to either modify some data structure on which we work to remove those whitespaces from it or do it on the fly when converting model to the view or view to the DOM.

Doing this on the fly during m->v conversion is hard because of markers and other ranges and positions that might be affected.

It seems a bit easier to clone the model with all markers, trim it and them convert. Still – it's a challenge because there may be a logic tracking those markers and the model structure and this logic would need to be cloned too (i.e. applied to the cloned structure).

Finally, this could theoretically be done on the view, but doing this there means losing a lot of information. The quality and stability of such a solution would be lower than if this was done on the model.

Therefore, taken these issues, we decided to focus now on simply returning this empty data string when there's no visible content in the editor. We won't work on trimming for now.

Reinmar added a commit to ckeditor/ckeditor5-engine that referenced this issue Feb 14, 2019
Feature: Introduced whitespace trimming to `Model#hasContent()`. `DataController#get()` method can now trim empty data (so it returns empty string instead of `<p>&nbsp;</p>`). Closes [ckeditor/ckeditor5#401](ckeditor/ckeditor5#401).

BREAKING CHANGE: `DataController#get()` method now returns an empty string when the editor content is empty (instead of returning e.g. `<p>&nbsp;</p>`).
@minaee
Copy link

minaee commented Dec 6, 2020

config.ignoreEmptyParagraph = is not working in Django CKEditor

I'm using ckeditor in my django project. I added the following code at the end of html file:

<script type="text/javascript" src="{% static "ckeditor/ckeditor-init.js" %}"></script>
<script type="text/javascript" src="{% static "ckeditor/ckeditor/ckeditor.js" %}"></script>
<script src="//cdn.ckeditor.com/4.15.0/full/ckeditor.js"></script>
<script>
    CKEDITOR.replace( 'message' );
    CKEDITOR.config.allowedContent = true;
    CKEDITOR.config.removeFormatAttributes = '';
    CKEDITOR.config.ignoreEmptyParagraph = false;
</script>

I want to stop users from submitting empty messages.

@FilipTokarski
Copy link
Member

@minaee CKEditor 4 and CKEditor 5 are two different apps. If you have a problem with CKE4, please report your issue in CKEditor 4 issue tracker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:feature This issue reports a feature request (an idea for a new functionality or a missing option).
Projects
None yet