Do you think this is possible with regex alone?
Here is my attempt on the Go Playground.
It works, but only with some dirty code:
http://play.golang.org/p/YysZCB3vlu
I want decomposed Korean characters to be combined into complete letters. For example, "ㅈㅗㅎㅇㅡㄴㄱㅏㅂㅅㅇㅣㅆㅏㅇㅛㅇㅏㅊㅣㅁㅇㅏㄴㄴㅕㅇㅎㅏㅅㅔㅇㅛㅇㅜㅔㄴ" should become 좋은값이싸요아침안녕하세요웬
For browsers that don't render Korean characters correctly:
좋 은 값 이 싸 요 아 침 안 녕 하 세 요 웬
The easy part is that a Korean letter always starts with one consonant followed by one or two vowels. That can be caught with (.([ㅏ-ㅣ])+).
The challenging part is the zero, one, or at most two optional consonants that follow the vowel. What makes it hard is that after those optional consonants comes another consonant that does not belong to the previous letter; that consonant marks the start of a new letter.
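To make the easy part concrete, here is a minimal Go sketch that matches only the head of each letter (one consonant plus one or two vowels). The character ranges and the name `head` are my own assumptions for illustration, and it deliberately ignores the trailing consonants.

```go
package main

import (
	"fmt"
	"regexp"
)

// head matches one consonant (ㄱ-ㅎ) followed by one or two vowels (ㅏ-ㅣ)
// from the Hangul Compatibility Jamo block. Trailing consonants are not handled.
var head = regexp.MustCompile(`[ㄱ-ㅎ][ㅏ-ㅣ]{1,2}`)

func main() {
	// The trailing ㅂㅅ of the first letter is skipped by this pattern.
	fmt.Println(head.FindAllString("ㄱㅏㅂㅅㅇㅣ", -1)) // prints [ㄱㅏ ㅇㅣ]
}
```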
Like below:
ㄱㅏㅂㅅㅇㅣ
= ㄱㅏㅂㅅ + ㅇㅣ
= 값 + 이
= 값이
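The boundary rule in this example can be sketched without regex lookahead (which Go's regexp package does not support): a trailing consonant stays with the current letter only if the character after it is not a vowel. This is a rough sketch under the assumption that the input is well formed; `splitSyllables`, `isVowel` and `isConsonant` are hypothetical names, not from my playground code.

```go
package main

import "fmt"

func isVowel(r rune) bool     { return r >= 'ㅏ' && r <= 'ㅣ' }
func isConsonant(r rune) bool { return r >= 'ㄱ' && r <= 'ㅎ' }

// splitSyllables groups a jamo stream into one group per letter.
// Assumption: well-formed input (two adjacent vowels always form a
// compound vowel, two trailing consonants always form a valid final).
func splitSyllables(s string) []string {
	rs := []rune(s)
	var out []string
	i := 0
	for i < len(rs) {
		start := i
		// A letter needs an initial consonant and at least one vowel;
		// anything else is passed through unchanged.
		if !isConsonant(rs[i]) || i+1 >= len(rs) || !isVowel(rs[i+1]) {
			out = append(out, string(rs[i]))
			i++
			continue
		}
		i += 2
		// Optional second vowel.
		if i < len(rs) && isVowel(rs[i]) {
			i++
		}
		// Up to two trailing consonants. A consonant directly followed by
		// a vowel is the initial of the next letter, so stop before it.
		for n := 0; n < 2 && i < len(rs) && isConsonant(rs[i]); n++ {
			if i+1 < len(rs) && isVowel(rs[i+1]) {
				break
			}
			i++
		}
		out = append(out, string(rs[start:i]))
	}
	return out
}

func main() {
	fmt.Println(splitSyllables("ㄱㅏㅂㅅㅇㅣ")) // prints [ㄱㅏㅂㅅ ㅇㅣ]
}
```

The one-character peek at `rs[i+1]` plays the role a lookahead would play in a regex engine that has one.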
It is possible to catch all the patterns with if-conditions and basic regex, but it would be good to have a shorter version of this.
My ultimate goal is to convert "ㅈㅗㅎㅇㅡㄴㄱㅏㅂㅅㅇㅣㅆㅏㅇㅛㅇㅏㅊㅣㅁㅇㅏㄴㄴㅕㅇㅎㅏㅅㅔㅇㅛㅇㅜㅔㄴ" to 좋은값이싸요아침안녕하세요웬
For browsers that don't render Korean characters correctly:
좋 은 값 이 싸 요 아 침 안 녕 하 세 요 웬
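For the conversion step itself, once the jamo of a single letter are isolated, the precomposed syllable can be computed arithmetically from the Unicode layout: 0xAC00 + (initial*21 + vowel)*28 + final. Below is a minimal sketch, assuming the input is one already-separated group such as "ㄱㅏㅂㅅ"; `compose`, `indexOf` and the lookup tables are hypothetical names introduced only for this illustration.

```go
package main

import "fmt"

// Jamo in the order Unicode uses for syllable composition.
const (
	initials = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
	finals   = "ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ"
)

// Two-jamo sequences that merge into a single compound jamo.
var vowelPairs = map[string]rune{
	"ㅗㅏ": 'ㅘ', "ㅗㅐ": 'ㅙ', "ㅗㅣ": 'ㅚ', "ㅜㅓ": 'ㅝ',
	"ㅜㅔ": 'ㅞ', "ㅜㅣ": 'ㅟ', "ㅡㅣ": 'ㅢ',
}
var finalPairs = map[string]rune{
	"ㄱㅅ": 'ㄳ', "ㄴㅈ": 'ㄵ', "ㄴㅎ": 'ㄶ', "ㄹㄱ": 'ㄺ', "ㄹㅁ": 'ㄻ', "ㄹㅂ": 'ㄼ',
	"ㄹㅅ": 'ㄽ', "ㄹㅌ": 'ㄾ', "ㄹㅍ": 'ㄿ', "ㄹㅎ": 'ㅀ', "ㅂㅅ": 'ㅄ',
}

// indexOf returns the rune position of r in set, or -1 if absent.
func indexOf(set string, r rune) int {
	for i, c := range []rune(set) {
		if c == r {
			return i
		}
	}
	return -1
}

// compose turns the jamo of ONE letter into a precomposed syllable.
// It reports false when the group is not a valid letter.
func compose(group string) (rune, bool) {
	rs := []rune(group)
	if len(rs) < 2 {
		return 0, false
	}
	l := indexOf(initials, rs[0])
	i := 1
	for i < len(rs) && rs[i] >= 'ㅏ' && rs[i] <= 'ㅣ' {
		i++
	}
	vs, ts := rs[1:i], rs[i:] // vowel part, trailing consonant part
	if l < 0 || len(vs) == 0 || len(vs) > 2 || len(ts) > 2 {
		return 0, false
	}
	v := vs[0]
	if len(vs) == 2 {
		var ok bool
		if v, ok = vowelPairs[string(vs)]; !ok {
			return 0, false
		}
	}
	t := 0 // 0 means "no final consonant"
	if len(ts) > 0 {
		f := ts[0]
		if len(ts) == 2 {
			var ok bool
			if f, ok = finalPairs[string(ts)]; !ok {
				return 0, false
			}
		}
		idx := indexOf(finals, f)
		if idx < 0 {
			return 0, false
		}
		t = idx + 1
	}
	// Vowels ㅏ..ㅣ are contiguous and already in composition order.
	return rune(0xAC00 + (l*21+int(v-'ㅏ'))*28 + t), true
}

func main() {
	r, ok := compose("ㄱㅏㅂㅅ")
	fmt.Println(string(r), ok) // prints 값 true
}
```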