We built a topic modelling feature over the explanations of problems. The topics are computed offline using the Latent Dirichlet Allocation (LDA) algorithm.
Given a fixed number of topics, LDA learns a representation of each topic (a distribution over words) and, for every document in a collection, the distribution of that document over those topics.
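To make the two outputs of LDA concrete, here is a minimal sketch that fits a tiny model with collapsed Gibbs sampling, one common way to estimate LDA. It is an illustration only and not necessarily the implementation behind our offline job. The function name `fit_lda`, the hyperparameters `alpha`, `beta`, `iterations` and `seed`, and the toy documents are all assumptions made up for this example.

```python
import random


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling (illustrative sketch).

    docs is a list of token lists. Returns (doc_topic, topic_word, vocab).
    alpha and beta are assumed symmetric Dirichlet priors.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic_counts = [[0] * num_topics for _ in docs]
    topic_word_counts = [[0] * vocab_size for _ in range(num_topics)]
    topic_totals = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic_counts[d][z] += 1
            topic_word_counts[z][word_id[w]] += 1
            topic_totals[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic_counts[d][z] -= 1
                topic_word_counts[z][wid] -= 1
                topic_totals[z] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic_counts[d][k] + alpha)
                    * (topic_word_counts[k][wid] + beta)
                    / (topic_totals[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic_counts[d][z] += 1
                topic_word_counts[z][wid] += 1
                topic_totals[z] += 1

    # Smoothed estimates: one topic distribution per document,
    # and one word distribution per topic.
    doc_topic = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic_counts)
    ]
    topic_word = [
        [(c + beta) / (topic_totals[k] + vocab_size * beta) for c in row]
        for k, row in enumerate(topic_word_counts)
    ]
    return doc_topic, topic_word, vocab


# Toy documents invented for this example (not our real dataset).
toy_docs = [
    "syntax error unexpected token".split(),
    "syntax error missing bracket".split(),
    "unable to import module".split(),
    "import failed module missing".split(),
]
doc_topic, topic_word, vocab = fit_lda(toy_docs, num_topics=2)
for row in doc_topic:
    print([round(p, 2) for p in row])
```

Each printed row is the topic distribution of one document and sums to 1. Each row of `topic_word` is the word distribution of one topic, which is what we inspect to give a topic a human-readable label.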
For the dataset, we used the explanations of all the problems, retrieved through the API.
The five topics we discovered are:
- Syntax error
- Invalid data type / too many arguments / variable not initialized
- Documentation not set / variable property not set
- Function not defined / function does not return the correct item
- Unable to import