
[APP/MOD] CM7 - LatinIME with Finnish layout and dictionary


Guest sm4tik

UPDATE Jul 25

I've removed libjni_latinime.so from the flashable zip, so it should be safe to try with other ROMs now. Remember to back up your original LatinIME.apk just in case.

UPDATE Jul 18

The Finnish layout got merged into N135; the dictionary is still under review. I finally managed to make the flashable zip, in which I included libjni_latinime.so too. You could also try this with other Gingerbread ROMs (NOT TESTED!). Please let me know how it goes.

EDIT: DON'T USE WITH OTHER ROMS, IT WILL CAUSE FORCE CLOSES. I'll update the zip next week when I'm back home. Thanks KonstaT for the info!

v0.21

Optimized apk

v0.2

Fixed alternate characters

Added Finnish dictionary by Tero Auvinen (the one used with Scandinavian Keyboard)

v0.1

Initial release

Clockwork flashable zip

latinime-fi-signed.zip

md5 5ece57ff41462b985f8bc235fa806df6
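
For anyone who wants to check the download before flashing, the md5 above can be compared against the file on disk. A minimal Python sketch, assuming the zip was saved under the name shown in this post; the helper names md5_of_file and matches_expected are made up for illustration:

```python
import hashlib

# md5 quoted in the post for latinime-fi-signed.zip
EXPECTED_MD5 = "5ece57ff41462b985f8bc235fa806df6"


def md5_of_file(path, chunk_size=65536):
    """Return the hex md5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's md5 equals the expected hex string (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the zip was saved.
    print(matches_expected("latinime-fi-signed.zip"))
```

If the result is False, the download is probably corrupted or it is a different version of the zip.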

Edited by sm4tik

Guest tonnikala

Nice to hear that somebody is working with this issue.

I looked for a Finnish main.dict and found one that is 78 kB in size (the Swedish one is 911 kB). That dictionary was included in this release, http://forum.xda-developers.com/showthread.php?t=875202, and in the Multiling keyboard (available on the Market).

Maybe the HTC keyboard could help us with the dictionary?

If we start making our own dictionary from scratch, here are a few good links:

http://code.google.com/p/softkeyboard/wiki/BinaryDictionaries

http://forum.xda-developers.com/showthread.php?t=1027207

http://kaino.kotus.fi/sanat/nykysuomi/ (a word list of modern Finnish; it lacks word frequencies, though, so it may be unusable)

http://android.git.kernel.org/?p=platform/packages/inputmethods/LatinIME.git;a=blob;f=dictionaries/sample.xml;h=85233b63a8b8a1043fceae592b567b93ee275504;hb=HEAD (info on weighting)


Guest sm4tik

Thanks for the links. Actually, the Gingerbread keyboard I'm using is the same keyboard you linked to at XDA. It has only been stripped down to fi and en, and the person who posted it to androidsuomi.fi changed the fi layout so that it shows all the numbers in the top row. The Finnish dictionary is usable, but not the best I've seen; I think the one in Scandinavian Keyboard was quite impressive. Not sure though, it's been a while since I last used it.

I don't think I have time to start a dictionary from scratch atm, but if someone wants to do it, please do :)

edit: ..sorry, I lied, the dictionary is NOT the same, it's over 900K. Uploading an update in a few moments.

Edited by sm4tik

Guest tonnikala

Hmm..

I checked that Scandinavian Keyboard. I found a .dict file and it actually worked with the LatinIME.apk keyboard. Maybe we can use that dictionary (or is it allowed)? It contains 80000 unique weighted words and its size is 973 kB. Sounds good to me. In my tests I just replaced the Swedish dictionary with the Finnish one.

The layout problem can be fixed by copying the .xml files from \res\xml-sv\ to \res\xml-fi\ (within the .apk file).

That didn't get the Finnish dictionary working, though (I made a raw-fi folder and put the dictionary there).

As for getting the source right, I'm not the right person to do it... :)

Edited by tonnikala

Guest tonnikala

I just found this one: http://forum.xda-dev...ad.php?t=695701

With that tool I managed to get the Finnish layout (by copying the Swedish one's .xml files) and the Finnish dictionary working. Now "Input languages" shows Suomi(Suomi) - dictionary available :)

You can download the modified keyboard from here: http://koti.mbnet.fi/tkala/LatinIME.zip

It uses Scandinavian Keyboard's dictionary.

Edited by tonnikala

Guest sm4tik

Updated first post.

v0.2

Fixed alt characters

Added Finnish dictionary

Question about the output directory after building CM7

There is a system directory inside out/target/product/blade, what's the difference between this and the one in the update.zip? I found the LatinIME.apk to be only 8.2M in out/.../system/app while the one in the update.zip is over 12M (after including the Finnish stuff into it). Both seem to work the same?

BTW, how the heck do I edit the thread title with this new forum thing?


Guest sm4tik

Maybe there's somebody out there who can help us to get that dictionary right. I don't want to include a dictionary which turns out to be something that's copyrighted or whatever. The current dictionary I added is 976K in size, but I have no idea where it's originally from!!


Guest tonnikala

Yeah. That Scandinavian Keyboard's Finnish dictionary was made by Tero Auvinen (ta (at) iki.fi). I have no idea about the GPL or other licenses, what they allow and so on..

Edited by tonnikala

Guest KonstaT

Thanks for this. You inspired me to make one for myself too. :)

I just noticed last night that Scandinavian Keyboard with the Finnish dictionary doesn't work with Gingerbread (Ginger Stir Fry). The keyboard itself works, but it only suggests words that are in the user dictionary.

I used the CM7N131 LatinIME.apk and the Finnish dictionary by Tero Auvinen. You could really trim it down by compressing and optimizing it properly. Mine is less than 8 MB with all dictionaries, and I also made a version with only English, Finnish and Swedish that is just over 2 MB.

You could contact Tero Auvinen and ask if you can use his dictionary. It's a free add-on to a free application, so I think it should be fine as long as you're not trying to sell your keyboard. Here is a Market link for it: https://market.android.com/details?id=com.android.inputmethod.norwegian.finnishdictionary Are you trying to get this included in CM7?

Your v0.2 seems pretty much finished so may I suggest you make that into a flashable zip.

Good job.


Guest sm4tik

I compared the dictionary I used and the one by Tero and it's a match :)

You're right about optimizing the apk, now it's down to 7,5M. It didn't cross my mind that the apks in CM are not optimized by default.. you learn something new every day! I'll look into making it into a flashable zip; that's another thing I've meant to learn for a while now. I think it could be a good idea to make two versions, one "official" with Finnish included and another like your stripped-down version. My initial idea was to try to get this upstream, but I think it might be a bit of an overkill to add 1M to CM just to make the few Finnish guys happy. We'll see where all this is heading.


I wouldn't worry about the extra 1mb. If they cared about things like that then they could compress their apks better & save a lot more space. They do care about supporting other languages, but they need native speakers to submit patches.

See if you can submit a patch for Finnish support in the keyboard & then it's up to them if they want to include it or not.

Edited by wbaw

Guest sm4tik

Yeah, I guess you're right. The layout and dictionary are in different branches anyway, so I guess this would mean 2 commits? Even if the dictionary doesn't get merged, I think CM would still benefit from having a fi layout. The lack of it has been mentioned many times in different places, and for people new to CM (or Android in general) this would mean much less confusion. At least it took me quite a while to figure out that it wasn't my fault: the settings were all right, but the layout just wasn't available even though it was listed under input languages.


Guest KonstaT

This is slightly off topic, but it's about Finnish localization. If you're about to commit stuff to CM7, here is another idea: changing the clock format from HH.mm to HH:mm. It's a small thing, but I know it's been bugging lots of people. This might be happening for Norwegians, so why not for Finns too...

Make Norwegian use HH:MM time format instead of HH.MM

http://review.cyanogenmod.com/#change,6685


Guest sm4tik

I'll do that one too now that I'm at it. How about the translation, do you know if someone is working on it? Any other stuff we Finns are missing?

Anyway, here's the gerrit review link for a Finnish layout. I still have to contact Tero about the dictionary.

http://review.cyanogenmod.com/#change,6742

edit: About the dictionary. If anyone knows whether it should go to

vendor/cyanogen/overlay/dictionaries/packages/inputmethods/LatinIME/java/res

or

vendor/cyanogen/overlay/common/packages/inputmethods/LatinIME/java/res

please let me know!

The dictionary in my build is in the first path (just followed the Swedes). What's the difference between the two?

Edited by sm4tik

Guest sm4tik

The layout got merged, I've contacted Tero about the dictionary and as soon as I get his approval (I hope) I'll submit it too. Still have to fix that clock format, the fix for Norwegians got merged so I think it won't take long for it to go through. I'll keep y'all updated.

edit: http://review.cyanogenmod.com/#change,6762

I'm not going to fix the time format, as it is correct the way it is according to standards. I guess the colon separator is just something we're all used to, at least I am.

Edited by sm4tik

Guest sm4tik

Updated first post.

Now flashable with clockwork. I tested it with CWM 3.0.2.7, but it's the first update.zip I've created, so if there are any problems with it please let me know!

Edited by sm4tik

Guest KonstaT

Well done.

I think you could lose the libjni_latinime.so though. Almost every ROM has one already. If you try to install this to Ginger Stir Fry or other ZTE leak based ROMs, all you'll get is force-closes. Libs are not interchangeable because of the 2G-3G vmsplit difference.

Hope to see the dictionary pushed forward to the CM7 too. :)


Guest sm4tik

I'm away from home this week, so I can't fix it. Thanks for the info, I thought there might've been something tricky with that lib.. I'll update the first post and fix it as soon as I get back home!


  • 8 months later...
Guest KonstaT

How to make ICS compatible LatinIME dictionary

Posted this here as it is sort of related. As some might know, Gingerbread Android keyboard dictionaries are not compatible with the ICS LatinIME. Here is a quick guide on how to make an ICS main.dict.

1. First you need a wordlist weighted by how often the different words appear. Save it as wordlist.xml, for example. It should look something like this:

[code]
<wordlist>
<w f="255">this</w>
<w f="255">is</w>
<w f="128">sample</w>
<w f="1">wordlist</w>
</wordlist>
[/code]

2. Then you need makedict. Compile makedict from the AOSP/CM9 source:

[code]
. build/envsetup.sh
lunch cm_blade-userdebug    # example target, pick the one for your device
make makedict
[/code]

You'll have makedict.jar in your out directory. I attached a prebuilt version below (compiled for x86, should work under Windows too; rename .zip -> .jar).

makedict.zip

3. Make the new dictionary. Copy your wordlist.xml and makedict.jar into the same directory and run:

[code]
java -jar makedict.jar -s wordlist.xml -d main.dict
[/code]

4. Copy your main.dict into your AOSP/CM9 source tree. In the case of CM9 it goes to vendor/cm/overlay/dictionaries/packages/inputmethods/LatinIME/java/res/raw-xx/main.dict (where xx is your language code). Compile LatinIME, or copy your main.dict into a prebuilt LatinIME.apk.
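
The wordlist from step 1 can be generated from raw word counts. Below is a rough Python sketch, not part of makedict: the function names and the log scaling of counts onto 1..255 are assumptions for illustration (the sample in step 1 only shows that f values such as 1, 128 and 255 occur), so adjust the weighting to taste.

```python
import math
from xml.sax.saxutils import escape


def scale_frequency(count, max_count):
    """Map a raw count onto 1..255 using a log scale (an assumed scheme)."""
    if count <= 0 or max_count <= 0:
        return 1
    ratio = math.log1p(count) / math.log1p(max_count)
    return max(1, min(255, 1 + round(254 * ratio)))


def build_wordlist_xml(counts):
    """Build wordlist.xml text from a {word: count} mapping."""
    max_count = max(counts.values(), default=0)
    lines = ["<wordlist>"]
    # Most frequent words first, ties broken alphabetically.
    for word, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        freq = scale_frequency(count, max_count)
        lines.append('<w f="%d">%s</w>' % (freq, escape(word)))
    lines.append("</wordlist>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up counts, only to show the call.
    sample_counts = {"this": 1000, "is": 1000, "sample": 40, "wordlist": 0}
    with open("wordlist.xml", "w", encoding="utf-8") as handle:
        handle.write(build_wordlist_xml(sample_counts))
```

The output file is written as UTF-8; whether makedict expects anything else is not something this sketch checks.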

Here is good info on making wordlists etc.

http://forum.xda-developers.com/showthread.php?t=1027207

Here is a trimmed CM9 LatinIME.apk with English, Swedish and Finnish dictionaries (rename .zip -> .apk). I was lucky to find balanced finnish wordlist here:

https://svn.kapsi.fi/ave/android/finnish_dictionary/tools/

LatinIME.zip


  • 1 month later...
Guest KonstaT

I finally got around to uploading the Finnish dictionary to gerrit. Patches are here and here. I'm pretty sure almost no one cares, so it will probably never get merged. :P At least it's there now so that the few Finnish users can pick it up.


Guest shmizan

Very nice work. I'd appreciate it more, but I don't know Finnish :P

The current Hebrew word prediction causes LatinIME to die and restart, so I'm guessing something's wrong with it.

Do you have any idea how to do the opposite, going from a dict to an xml with the values for how often the words appear?

Should it be "java -jar makedict.jar -s main.dict -d wordlist.xml", or can't this tool handle it?

edit: I'm experimenting a bit. I built a main.dict using your makedict.zip.

The Hebrew wordlist I found is here and it builds okay.

I then replaced main.dict inside LatinIME.apk with the output and flashed it. Then I get a force close. What do you think?

Edited by shmizan

Guest KonstaT

You can use

[code]
java -jar makedict.jar -s main.dict -x wordlist.xml
[/code]

to extract the wordlist from an existing dictionary. The problem is that the output is in the wrong form; it looks something like this:

[code]
<wordlist format="2">
<w word="this" f="225"></w>
<w word="is" f="225"></w>
<w word="sample" f="128"></w>
<w word="wordlist" f="1"></w>
</wordlist>
[/code]

For some reason it is in a format that can't be built back into a dictionary. :o Maybe it would be possible to write some script/macro to change the format.

I can't even get that Hebrew wordlist to compile into a dictionary; all I get is errors. I think it might have something to do with the text encoding.
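
The script/macro idea mentioned above could look roughly like this in Python. It rewrites the extracted form (word as an attribute) into the form shown in the how-to (word as the element text). This is only a sketch: convert_wordlist is a made-up name, and whether makedict accepts the result has not been verified here.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def convert_wordlist(extracted_xml):
    """Turn <w word="x" f="n"></w> entries into <w f="n">x</w> entries.

    Accepts str or bytes. Entries that already carry the word as element
    text are passed through unchanged.
    """
    root = ET.fromstring(extracted_xml)
    lines = ["<wordlist>"]
    for entry in root.iter("w"):
        word = entry.get("word")
        if word is None:
            word = entry.text or ""
        freq = entry.attrib["f"]  # raises KeyError if an entry has no weight
        lines.append('<w f="%s">%s</w>' % (freq, escape(word)))
    lines.append("</wordlist>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # File names are only examples.
    with open("wordlist.xml", "rb") as source:
        converted = convert_wordlist(source.read())
    with open("wordlist_converted.xml", "w", encoding="utf-8") as target:
        target.write(converted)
```

The input is read as bytes so that the XML parser can deal with the file's own encoding declaration, if it has one.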


Guest shmizan

Oops, sorry, I linked the bad one (it has 2 duplicated values). This one compiles fine: http://softkeyboard....ml/he_small.xml

could you test that one?
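
Since duplicated entries and text encoding both came up as suspects, a quick sanity check on a wordlist before feeding it to makedict might look like this. A Python sketch with made-up helper names; it assumes the layout from the how-to (word as the text of each w element) and that the file is meant to be UTF-8, neither of which is confirmed for the Hebrew list.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def find_duplicate_words(wordlist_xml):
    """Return a sorted list of words that appear more than once."""
    root = ET.fromstring(wordlist_xml)
    counts = Counter((entry.text or "").strip() for entry in root.iter("w"))
    return sorted(word for word, seen in counts.items() if seen > 1)


def is_valid_utf8(raw_bytes):
    """True if the raw file contents decode cleanly as UTF-8."""
    try:
        raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # Example file name only.
    with open("wordlist.xml", "rb") as handle:
        raw = handle.read()
    print("valid UTF-8:", is_valid_utf8(raw))
    print("duplicates:", find_duplicate_words(raw))
```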

Using the command you wrote, I get errors when trying to decompile the original Hebrew dict file from LatinIME:

[code]
shmizan@ubuntu:~/Desktop/dict$ java -jar makedict.jar -s main.dict -x wordlist.xml
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 48
    at com.android.inputmethod.latin.BinaryDictInputOutput.readCharGroup(BinaryDictInputOutput.java:781)
    at com.android.inputmethod.latin.BinaryDictInputOutput.readNode(BinaryDictInputOutput.java:927)
    at com.android.inputmethod.latin.BinaryDictInputOutput.readNode(BinaryDictInputOutput.java:941)
    at com.android.inputmethod.latin.BinaryDictInputOutput.readNode(BinaryDictInputOutput.java:941)
    at com.android.inputmethod.latin.BinaryDictInputOutput.readNode(BinaryDictInputOutput.java:941)
    at com.android.inputmethod.latin.BinaryDictInputOutput.readNode(BinaryDictInputOutput.java:941)
    at com.android.inputmethod.latin.BinaryDictInputOutput.readDictionaryBinary(BinaryDictInputOutput.java:993)
    at com.android.inputmethod.latin.DictionaryMaker.readBinaryFile(DictionaryMaker.java:188)
    at com.android.inputmethod.latin.DictionaryMaker.readInputFromParsedArgs(DictionaryMaker.java:168)
    at com.android.inputmethod.latin.DictionaryMaker.main(DictionaryMaker.java:154)
[/code]

Looking at this post, that format you posted seems ok?

edit: I converted the dict I built (from the link I posted) back to xml and got the same output as you described. It can't be compiled either. Maybe the makedict syntax is different?

Edited by shmizan
