suhasm/dictionary-scrapers

Dictionaries

Indic dictionary data not (easily) available elsewhere.

Required Cleanup:

  • Some dictionaries contain headwords that encode alternative forms. For instance, an entry in the haḷagannaḍa padasampada reads ಕಂಕೇಲಿ(ಳಿ). Such entries have yet to be identified and split into separate headwords (here, ಕಂಕೇಲಿ and ಕಂಕೇಳಿ). Not all dictionaries use the same conventions for alternative forms.
  • A small fraction of the Kannada words are corrupt on the Bharatavani site. For instance, the saṃkṣipta kannaḍa nighaṇṭu reads ಕ¾õÉಗೊರಲ, ಕ¾õÉಕಂಠ. These need to be restored.
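
The two cleanup tasks above could be approached roughly as follows. This is only a sketch, not code from this repository: the function names `split_headword` and `looks_corrupt` are hypothetical, and it assumes (a) that a parenthesised suffix replaces the same number of trailing syllable clusters in the headword, as in ಕಂಕೇಲಿ(ಳಿ), and (b) that corrupt entries can be recognised by stray Latin-1 characters such as ¾, õ and É. Other dictionaries may follow different conventions, so anything flagged or split this way would still need manual review.

```python
import re
import unicodedata

# Assumed convention: "base(alt)", where alt replaces the tail of base.
ALT_FORM = re.compile(r"^(?P<base>[^()\s]+)\((?P<alt>[^()]+)\)$")


def clusters(text):
    """Group each base character with the combining marks that follow it."""
    out = []
    for ch in text:
        if out and unicodedata.category(ch).startswith("M"):
            out[-1] += ch  # vowel sign, anusvara, virama, etc.
        else:
            out.append(ch)
    return out


def split_headword(headword):
    """Return [base, variant] for 'base(alt)' headwords, else [headword]."""
    headword = headword.strip()
    match = ALT_FORM.match(headword)
    if match is None:
        return [headword]
    base, alt = match.group("base"), match.group("alt")
    base_clusters = clusters(base)
    alt_clusters = clusters(alt)
    if len(alt_clusters) >= len(base_clusters):
        return [headword]  # ambiguous: leave untouched for manual review
    stem = "".join(base_clusters[: -len(alt_clusters)])
    return [base, stem + alt]


def looks_corrupt(headword):
    """Heuristic: flag headwords containing Latin-1 Supplement characters."""
    return any("\u0080" <= ch <= "\u00ff" for ch in headword)


if __name__ == "__main__":
    print(split_headword("ಕಂಕೇಲಿ(ಳಿ)"))
    print(looks_corrupt("ಕ¾õÉಗೊರಲ"))
```

Under these assumptions, the first call should yield the two headwords named above, and the second should flag the corrupt entry. The corruption check only detects damaged entries; restoring the intended text is a separate problem that this sketch does not address.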

Now Available

  1. [KA] ಪಂಪನ ನುಡಿಗಣಿ (paṃpana nuḍigaṇi) (~12K words)

    A dictionary of words used by the distinguished Old Kannada poet Pampa (900 AD) in his Ādipurāṇa and Paṃpabhārata. Author: Dr. P.V. Narayana. This data has been scraped off Bharatavani. WorldCat entry here. Read a pdf here.

    Sample entry:
    ಬಾಳ್
    [ನಾ] ಕತ್ತಿ (ಕೀಱಿ ನೆತ್ತಿಯೊಳ್ ಬಾಳಂ ನಿರ್ನೆರಮೂಱಿ ಚಲದಿನೆರಗಿಸಲಿರೆ ಭರತಂಗೆಱಗುವೆಱಕಂ ಅಂಜುಮೆಯಲ್ತೇ: ಆದಿಪು, ೧೪. ೭೫)

  2. [KA] ಚಂಪೂ ನುಡಿಗನ್ನಡಿ (campū nuḍigannaḍi) (~30K words)

    A dictionary of Old Kannada based on a large number of works (see pp. 5-6). Author: Dr. P.V. Narayana. This data has been scraped off Bharatavani. WorldCat entry here. Read a pdf here.

    Sample entry:
    ಬಾಳ್
    ಕತ್ತಿ (ಕೀಱ ನೆತ್ತಿಯೊಳ್ ಬಾಳಂ ನಿರ್ನೆರಮೂಱ ಚಲದಿನೆರಗಿಸಲಿರೆ ಭರತಂಗೆಱಗುವೆಱಕಂ ಅಚಿಜಮೆಯಲ್ತೇ: ಆದಿಪು, ೧೪. ೭೫)

  3. [KA] ಹಳಗನ್ನಡ ಪದಸಂಪದ (haḷagannaḍa padasampada) (~30K words)

    A dictionary of Old Kannada. There is likely significant overlap with campū nuḍigannaḍi and paṃpana nuḍigaṇi, since all three are by the same author. Author: Dr. P.V. Narayana. This data has been scraped off Bharatavani. WorldCat entry here. Read a pdf here.

    Sample entry:
    ಬಾಳ್
    1. ಕತ್ತಿ 2. ಲಾಮಂಚ (ಅರ್ಥಸಂದಿಗ್ಧತೆಯ ಶಬ್ದ)

  4. [KA] ಸಂಕ್ಷಿಪ್ತ ಕನ್ನಡ ನಿಘಂಟು (saṃkṣipta kannaḍa nighaṇṭu) (~43K words)

    A dictionary of Kannada that spans all time periods, published by the Kannaḍa Sāhitya Pariṣattu. I'm not sure which edition this data is based on. There is also a larger multi-volume dictionary, which has not yet been digitised. Publisher: Kannada Sahitya Parishattu. Editor: G.V. Venkatasubbiah (probably; I don't have access to this book). This data has been scraped off Bharatavani. WorldCat entry here.

    Sample entry:
    ಬಾಳ್
    1. ಬದುಕು. 2. ಜೀವ. 3. ಕತ್ತಿ. 4. ಉಳುಮೆ.

  5. [HI] अवधी शब्द-कोश (avadhī śabdakośa) (~9K words)

    A dictionary of Awadhi published by Lucknow University. Editors: सूर्यप्रसाद दीक्षित, सजीवनलाल यादव. This data has been scraped off Bharatavani. WorldCat entry not found. See pdf here.

    Sample entry:
    अंटसंट
    ऊटपटांग, बेसिर-पैर की बात

  6. [KA] ಜಾನಪದ ವಸ್ತುಕೋಶ (jānapada vastukośa) (~250 words)

    A dictionary of objects used in rural Karnataka. Editor: Sa. Chi. Ramesh. This data has been scraped off Bharatavani. WorldCat entry here. See pdf here.

    Sample entry:
    ಕೊಣಮಿಗೆ
    ರೊಟ್ಟಿ ಮಾಡುವುದಕ್ಕೆ, ಹಿಟ್ಟು ಕಲಸಿಕೊಳ್ಳಲು ಮತ್ತು ಆ ಬಳಿಕ ರೊಟ್ಟಿ ಬಡಿಯಲು/ತಟ್ಟಲು ಬಳಸಿಕೊಳ್ಳುತ್ತಾರೆ. ಇದರ ಆಕಾರ ಮತ್ತು ಗಾತ್ರಗಳಲ್ಲಿ ವ್ಯತ್ಯಾಸ ಇರುತ್ತದೆ. ವೃತ್ತಾಕಾರದ ಕೊಣಮಿಗೆಯು ರೊಟ್ಟಿ ತಯಾರಿಸಿಕೊಳ್ಳಲು ಬಳಕೆಯಾದರೆ ಆಯತಾಕಾರದವುಗಳಲ್ಲಿ ಅನ್ನ ಮೊಸರು ಕಲಸಿಕೊಳ್ಳಲು ಬಳಕೆಯಾಗುತ್ತದೆ. ಇವನ್ನು ಕಗ್ಗಲ್ಲು, ಮರ ಮತ್ತು ಲೋಹಗಳಿಂದ ತಯಾರಿಸುತ್ತಾರೆ. ಸಾಗುವಾನಿ, ಮಾವು, ಅತ್ತಿ ಮುಂತಾದ ಮರಗಳು ತಯಾರಿಕೆಗೆ ಬಳಕೆಯಾಗುತ್ತವ. ರೊಟ್ಟಿ ಮಾಡುವ ಯಂತ್ರವು ಬಂದ ಮೇಲೆ ಕೊಣಮಿಗೆ ಬಳಕೆ ಕಡಿಮೆಯಾಗುತ್ತಾ ಬಂದಿದೆ.

  7. [PA] Pali-Myanmar Abhidhan (~200,000 words)

    "Pali Myanmar Abhidhan is the world's largest Pali dictionary, a massive 23 volumes, with more than 200 000 words, a complete reference guide to the language of the root texts and commentaries. PEU is a project in progress to translate the Abhidhan's definitions into English, currently at about 80% human translated, the remainder is by Google." From https://pm12e.pali.tools/

Future

  1. Bhāratīya Bhāṣā Kośa (Bharatavani)
  2. English Kannada Nighaṇṭu (Bharatavani)
  3. Nepal German Manuscript Cataloguing Project (NGMCP) Database
  4. Several Kannada Dictionaries on Baraha.com.
  5. Kannada-English Etymological Dictionary by Učida and Rajapurohit (looks like p-acharya is already trying).
  6. Dravidian Etymological Dictionary (Burrow and Emeneau)
  7. G.V. Venkatasubbiah's English-Kannada and Kannada-English dictionaries from Kanaja
  8. Dasa Sāhitya Kośa from Kanaja
  9. Old Telugu Dictionaries
  10. Dravidian Etymological Dictionary, compiled by George Starostin on the basis of Burrow and Emeneau's work.
  11. New Catalogus Catalogorum
  12. Saraswati Mahal Online Catalog
  13. West Bengal Public Library Database
  14. Whitney's Roots
  15. Tadbhava and Deshi Word list in Nayakumāracariu (Ed. Hiralal Jain)
