-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature/beam search op #5052
feature/beam search op #5052
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- My concern is that comments in this PR are hard to follow for users.
- I am not sure it can work well with other dependent operators, because it lacks test and modifications of some merged operators may be required, so are we going to merge this PR right now?
paddle/operators/beam_search_op.cc
Outdated
AddAttr<int>("beam_size", "beam size for beam search"); | ||
|
||
AddComment( | ||
"This is a beam search operator that help to generate sequences."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment seems oversimple, making it hard to understand how this operator works for most users (I guess it is also hard to understand for reviewers).
For example:
- what is the format of its inputs/outputs?
- what are the operators it should be co-work with (this operator itself cannot complete the entire beam search process)
and so on ...
paddle/operators/beam_search_op.cc
Outdated
framework::OpAttrChecker *op_checker) | ||
: OpProtoAndCheckerMaker(proto, op_checker) { | ||
// inputs and outputs stored in proto | ||
AddInput("ids", "the candidate ids"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I prefer each comment is a complete sentence. I think it is better to use the upper casing for the first letter in a sentence, and add period at the end sentences, and make sure the comment is a meaningful sentence(easy to understand) rather than a self-defined phrase.
- I prefer to use the unabbreviated word
index
instead ofids
in the comment. Because I think the latter one is not a commonly used abbreviation for everyone.
distributed under the License is distributed on an "AS IS" BASIS, | ||
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
See the License for the specific language governing permissions and | ||
limitations under the License. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The indentation of the license has some problem. It should be like that in the accuracy_op.h
. (Confirmed this with qingqing.)
paddle/operators/beam_search_op.cc
Outdated
return true; | ||
} | ||
|
||
void BeamSearchAlgorithm::CollectSelectedResult( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I prefer to use BeamSearch
instead of BeamSearchAlgorithm
. But if using BeamSearch
, the name seems to be confusing with the operator's name.
My concern is that:
- In most references I have read, people usually user "beam search", and "beam search algorithm" seems a little strange to me (This is a personal feeling, I think maybe we can listen to other reviewer's advice).
- This operator has only been involved in two steps in the entire beam search process: "the beam expansion" and "beam shrink". Itself cannot complete the entire generation process. It has to co-work with other operators and the framework. So for me,
BeamSearchAlgorithm
seems to be a "too large word".
paddle/operators/beam_search_op.cc
Outdated
auto *scores_data = selected_scores->data<score_t>(); | ||
|
||
for (size_t high_id = 0; high_id < ids_->NumElements(lod_level_); | ||
high_id++) { // source sentence level |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Personally, I like to use
i++
instead of++i
, but admittedly, there is little difference wheni
is an integer. (The google style also says that there is no reason to give preference wheni
is an integer, so I think there is no strong reason to change this.) - The comment
// source sentence level
here is an incomplete sentence. We have had an online discussion, so I can understand it, but I am worried it may be hard to follow for many other users.
paddle/operators/beam_search_op.h
Outdated
* This is an implementation of beam search algorithm, input the candidate ids | ||
* and candidate scores, it will sort and select the top beam_size candidates, | ||
* return the corresponding ids and scores. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this kind of information can also be added into the operator's comments to let users know some details of this operators: the input of this operator expects and what it outputs.
As a user, I certainly also want to know (at least) the below questions:
- Where are the
candidate ids
and thecandidate scores
from? Which operator generates the candidate ids and the candidate scores? - Who will use the returned results of this operator?
using id_t = size_t; | ||
using score_t = float; | ||
/* | ||
* Input the arguments that needed by this class. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comments lack of useful information. This makes me think what are the meanings of these inputs? and who produces these inputs(this determines how I can use this operator).
* It stores the corresponding scores of candidate ids in selected_ids. | ||
* | ||
* Return false if all the input tensor is empty, in machine translation task | ||
* that means no candidates is provided, and the task will stop running. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- a small grammatical mistake: Return false if all the input tensors are empty.
- Is it necessary to make the comments/this operator highly depends on the machine translation task?
paddle/operators/beam_search_op.h
Outdated
const std::vector<std::vector<Item>>& inputs); | ||
|
||
/* | ||
* For each source, select top beam_size records. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For each unfinished prefix.
I think we can avoid using "source" because it is not a common concept in standard beam search. The source
is used in the seq-to-seq model, but beam search can be used more than that.
paddle/operators/beam_search_op.h
Outdated
auto& scores = scores_var->Get<framework::LoDTensor>(); | ||
size_t level = Attr<int>("level"); | ||
size_t beam_size = Attr<int>("beam_size"); | ||
BeamSearchAlgorithm alg(ids, scores, level, beam_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I think BeamSearchAlgorithm is not a good name. I wrote my reasons in the above comments.
… feature/beam_search_op
paddle/operators/beam_search_op.h
Outdated
* ids: | ||
* LoD (should have 2 levels) | ||
* first level: [0, 1, 3] | ||
* second level: [0, 1, 2, 3, 4] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[0, 1, 2, 3, 4] should be [0, 3, 6, 9]?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the first level should be [0, 1, 4]
, the second level is right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
the beam search operator for source decoder.
fixes: #5018
leave the python test in another PR, after source decoder is ready.