The code was downloaded from the Apache site and is being modified for MTP2.
My MTP2 Project
- In my time I used a two-class SVM with a linear kernel. Try a Gaussian kernel; the linear one will not work as the data grows (Ganesh and Sunita had told me to use a two-class SVM).
- Make sure every tourism page goes into the positive set.
- We took the negative set only from health pages, which is not enough. Find all the orthogonal categories and add a few URLs from each to the base set. We had marked pages from outside India as negative, which causes errors. Read the error analysis in my report for the exact details.
- A one-class SVM is probably a better fit here; try to answer this question as well. Implement this early, since getting results up to depth 10 takes around 10-12 hours.
- Try Gaussian Kernel
- Build Larger Negative Training set
- Put every tourism page in the positive set
- Pos URL Tokens :: Percentage of the URL's tokens that overlap with the token set of the already crawled URLs.
- Pos Parent URL Tokens :: Percentage of the parent URLs' tokens that overlap with the parent token set of the already crawled URLs.
- Pos Anchor Text of URL :: Percentage of the URL's anchor text that overlaps with the anchor text set of the already crawled URLs.
- Neg URL Tokens :: Percentage of the URL's tokens that overlap with the token set of the already discarded URLs.
- Neg Parent URL Tokens :: Percentage of the parent URLs' tokens that overlap with the parent token set of the already discarded URLs.
- Neg Anchor Text of URL :: Percentage of the URL's anchor text that overlaps with the anchor text set of the already discarded URLs.
- Average Parent Score :: Average of the scores of the URL's parent pages.
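
The overlap features listed above could be computed roughly as in the sketch below. This is only an illustration, not the project's actual code: the tokenization rule (lowercase, split on non-alphanumeric characters), the choice to count unique tokens, and the names `tokenize`, `overlap_percentage`, `average_parent_score` and `feature_vector` are all assumptions, since these notes do not say how the overlap is measured.

```python
import re


def tokenize(text):
    """Split a URL or anchor text into lowercase alphanumeric tokens (assumed rule)."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def overlap_percentage(tokens, seen_tokens):
    """Percentage of the unique `tokens` that also occur in `seen_tokens`."""
    unique = set(tokens)
    if not unique:
        return 0.0
    return 100.0 * len(unique & set(seen_tokens)) / len(unique)


def average_parent_score(parent_scores):
    """Mean of the parent page scores, or 0.0 when the URL has no scored parent."""
    if not parent_scores:
        return 0.0
    return sum(parent_scores) / len(parent_scores)


def feature_vector(url, parent_urls, anchor_text, parent_scores, crawled, discarded):
    """Build the seven features for one candidate URL.

    `crawled` and `discarded` are hypothetical dicts holding three token sets
    each, under the keys "url", "parent" and "anchor".
    """
    url_tokens = tokenize(url)
    parent_tokens = [t for p in parent_urls for t in tokenize(p)]
    anchor_tokens = tokenize(anchor_text)
    return [
        overlap_percentage(url_tokens, crawled["url"]),
        overlap_percentage(parent_tokens, crawled["parent"]),
        overlap_percentage(anchor_tokens, crawled["anchor"]),
        overlap_percentage(url_tokens, discarded["url"]),
        overlap_percentage(parent_tokens, discarded["parent"]),
        overlap_percentage(anchor_tokens, discarded["anchor"]),
        average_parent_score(parent_scores),
    ]
```

The token sets in `crawled` and `discarded` would be updated as the crawl proceeds, so the same URL can receive different feature values at different depths.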