Add support for Malayalam #85

Open
@grhoten

This is a GSoC Project idea.

Difficulty/Size: Medium

Right now, the Unicode Inflection project supports Arabic, Danish, German, English, Spanish, French, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Malay, Norwegian, Dutch, Portuguese, Russian, Swedish, Thai, Turkish and Chinese. Support for additional languages, such as Malayalam, is desired.

Here is background material for the Unicode Inflection concepts.

Expected Outcomes

  • Unicode Inflection code will be able to inflect nouns and personal pronouns for the language being supported. Examples include:
    • object + plural → objects
    • city + plural,genitive → cities’
  • Optionally inflect articles, prepositions, adjectives and verbs as necessary for a given language.
  • All tests of supported functionality should pass.
  • Support a language that isn’t already supported and that has sufficient lexical data in Wikidata. Examples include:
    • Estonian, Malayalam, Greek, Czech, Norwegian (Nynorsk), Slovak, Ukrainian, Bangla, Punjabi, Polish, Urdu, or Finnish
    • Perhaps others, but the required data would need to be added to Wikidata.
  • The lexical data will be derived from Wikidata. There is an existing tool to generate appropriate lexical dictionaries for each language, and there are examples of other supported languages.
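
To make the `lemma + grammemes → form` notation above concrete, here is a minimal C++ sketch of a dictionary-style lookup. All names in it (`ToyLexicon`, `Grammemes`, `makeToyLexicon`) are hypothetical and are not the Unicode Inflection API; the real project derives its dictionaries from Wikidata with the existing tool, whereas this toy hard-codes only the two English examples from this issue.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical illustration types; these are NOT the Unicode Inflection API.
using Grammemes = std::set<std::string>;             // e.g. {"plural", "genitive"}
using FormTable = std::map<Grammemes, std::string>;  // grammeme set -> surface form

class ToyLexicon {
public:
    // Register one inflected form of a lemma.
    void add(const std::string& lemma, const Grammemes& grammemes,
             const std::string& form) {
        entries_[lemma][grammemes] = form;
    }

    // Return the stored form, or the lemma unchanged when nothing matches.
    std::string inflect(const std::string& lemma,
                        const Grammemes& grammemes) const {
        auto entry = entries_.find(lemma);
        if (entry == entries_.end()) return lemma;
        auto form = entry->second.find(grammemes);
        return form == entry->second.end() ? lemma : form->second;
    }

private:
    std::map<std::string, FormTable> entries_;
};

// Build a lexicon holding only the two examples from this issue.
ToyLexicon makeToyLexicon() {
    ToyLexicon lexicon;
    lexicon.add("object", {"plural"}, "objects");
    lexicon.add("city", {"plural", "genitive"}, "cities’");
    return lexicon;
}
```

Because the grammemes are held in a `std::set`, the order in which they are requested does not matter: asking the toy lexicon for `city` with `{"genitive", "plural"}` gives the same form as `{"plural", "genitive"}`. Falling back to the unchanged lemma for unknown entries is just one possible design choice for this sketch.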

Skills

  • Required: Working proficiency in English
  • Required: Understanding of a language that is not already supported in Unicode Inflection
  • Required: Experience with writing software on Windows, Linux, or macOS
  • Required: Experience with C or C++
  • Preferred: Experience with cmake
  • Preferred: Ability to edit XML
  • Preferred: Ability to edit data in Wikidata
