GSoC 2025 Help: Integrate Unicode Inflection into MessageFormat 2 #84
Hello @grhoten, I am S Devipriya, a second-year Bachelor's student in CS interested in the project titled "Integrate Unicode Inflection into MessageFormat 2". I have coding experience in Python, C and C++, and I am currently familiarising myself with CMake. As a passionate language learner, I found that the topic of inflection and its widespread use cases sparked my curiosity. I have gone through the UTW videos and other resources on inflection and developed an idea of this project's scope.
As I understand it, Unicode Inflection is already integrated into MessageFormat 2 for ICU4J, and the current project is to do similar work for ICU4C. However, as this is my first time participating in an open source project, I would like some help learning more about the technical aspects of this integration. I would greatly appreciate it if you could suggest some next steps I can take to develop a deeper understanding of the project. I would also like your advice on whether this project is a good entry point for a beginner into Unicode's work on inflection.
I'm also interested in "Add support for a new language in Unicode Inflection". I am a native Malayalam speaker, so I would love to work on adding inflection support for Malayalam in Unicode Inflection. Please let me know if that project would be a better entry point, as it is a medium project whereas the former is a long project.
Thank you for your time!
Yours sincerely,
Replies: 1 comment 1 reply
Thank you for your interest. Before working on the project, you should register and apply for it on GSoC so that you can be selected. In the meantime, it's a great idea to explore what this project is about.
There is currently no integration between Unicode Inflection and MessageFormat. There is no Java implementation of Unicode Inflection, so the project is all about C++. There are currently no wrappers in other programming languages either.
Presumably there are ways to register the Unicode Inflection capability with MessageFormat, but this has not been tried or tested much, so this task will be a learning experience for everyone. The API referenced in #87 is probably the best place to start scoping such work.
As for adding support for new languages to Unicode Inflection, I can't provide much guidance on whether you're a good fit for either project. You should apply for whichever ones interest you the most. If you need inspiration for how to do such work, you may want to look at the existing Hindi support or the existing "How to add a new language" document. Malayalam does seem to have an adequate number of lexemes, so any edits to the data should hopefully be minimal.