Guest Chainfire Posted October 1, 2009 (edited)

Several new devices have an FPU these days (VFP in this case), a coprocessor that can speed up floating-point (fractional number) calculations. Devices that have an FPU include those based on the S3C6410 processor, like the Samsung Omnia II/Pro and Acer M900, and Snapdragon-based devices like the Toshiba TG01 and HTC Leo. However, Windows Mobile right now does not come with support for it.

NuShrike and I decided to do something about this, and FPU Enabler is the result. FPU Enabler is an application that patches coredll in-memory and replaces some of the FPU emulation routines with actual FPU routines, so all applications will automatically make use of them.

Now, obviously there are a number of caveats with an application like this.

First, the FPU code is not IEEE compliant. This means that in some edge cases calculation results are undefined, which may cause issues. Exceptions are not supported - so for example a divide by 0 will not raise an error, which may be problematic. If your device acts as the control board for a nuclear power plant, we would definitely advise against using this app.

Far from all instructions that can be sped up by FPU use are supported (yet). They may be in the future.

Actual real-world effects will depend heavily on the application used. For most applications, you'd have to look hard to notice it. This may change as (if) more instructions become supported. Some say Crayon Physics seems a bit snappier, though.

Because of the way this is patched in, you aren't actually reaching the full speed possible with a hardware FPU. Also, devices not already running in "ALLKMODE" are patched to run this way. As we haven't really found a good way to patch the context switching code yet, interrupts are disabled during FPU instructions and an extra jump into patch code is required. Disabling interrupts requires KMODE, and thus we patch everything to run this way.
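The in-memory patching described above can be pictured as a table of helper routines in which selected entries are swapped for faster ones and restored later. What follows is a minimal Python sketch of that idea only, not FPU Enabler's actual code: `PatchTable`, `soft_muls`, and `hard_muls` are hypothetical names, and ordinary Python functions stand in for the coredll emulation routine and its VFP replacement.

```python
from typing import Callable, Dict

Routine = Callable[[float, float], float]


def soft_muls(a: float, b: float) -> float:
    """Stand-in for the software-emulated __muls helper (hypothetical)."""
    return a * b


def hard_muls(a: float, b: float) -> float:
    """Stand-in for a hardware-backed replacement (hypothetical)."""
    return a * b


class PatchTable:
    """Holds the active routine per helper name and remembers the originals."""

    def __init__(self, routines: Dict[str, Routine]) -> None:
        self.active = dict(routines)
        self._originals: Dict[str, Routine] = {}

    def patch(self, name: str, replacement: Routine) -> None:
        # Save the original only once so repeated patching stays reversible.
        self._originals.setdefault(name, self.active[name])
        self.active[name] = replacement

    def unpatch_all(self) -> None:
        # Put every saved original back into the table.
        self.active.update(self._originals)
        self._originals.clear()

    def call(self, name: str, a: float, b: float) -> float:
        # Callers go through the table, so they pick up patches automatically.
        return self.active[name](a, b)


table = PatchTable({"__muls": soft_muls})
table.patch("__muls", hard_muls)
patched_result = table.call("__muls", 1.5, 2.0)
table.unpatch_all()
```

Because callers only ever look routines up by name, they need no changes to benefit from a patch, which is the property the post relies on when it says all applications use the FPU routines automatically.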
What we really need is for Microsoft, Samsung, Toshiba, HTC, etc to just simply enable FPU support in their kernel builds. That would solve pretty much all the issues, and be quite a bit faster even. We know CE6 supports it, and it is rumored WM7 will as well, but it would be great if they would put in support in new 6.5 builds :)

Instructions
Unpack the zip file somewhere
Copy the EXE and the DLL to \ on your device
Run the EXE
Click Patch button
If the Patch app says so, close it and restart it, click Patch again.
Wait until "Done!"
Keep the EXE running. Closing the EXE will "unpatch" the FPU instructions again.

Credits
Chainfire - Patcher code
NuShrike - FPU code
cmonex - Help with patch theory
no2chem - Help with patch theory

Samsung S3C6410:
Samsung Omnia II - main test device
Samsung Omnia Pro - untested
Acer M900 - tested and works
Acer F900 - untested
Acer X960 - untested

Qualcomm Snapdragon:
Toshiba TG01 - quick test done by cmonex
HTC Leo - untested

Patcher notes
DirtyBench is not really (or "really not") a reliable benchmark and does not benchmark all functions - hence the name. Instructions are benchmarked and then selected for patching or not. S3C6410 users will likely see about 17 functions patched, and 13 functions unpatched (those are very simple functions). Most effect will be seen in MUL and DIV instructions.

Currently patched functions
[*] __eqs
[*] __ges
[*] __gts
[*] __les
[*] __lts
[*] __eqd
[*] __ged
[*] __gtd
[*] __led
[*] __ltd
[*] __adds
[*] __subs
[*] __muls
[*] __divs
[*] __addd
[*] __subd
[*] __muld
[*] __divd
[*] __itos
[*] __itod
[*] __utos
[*] __utod
[*] __stoi
[*] __stou
[*] __stod
[*] __dtoi
[*] __dtou
[*] __dtos
Other functions may follow in the future.

Known issues
[*] None at the moment

Feel free to report issues, I'm not assuring you we will fix them, but they're interesting to know anyways.
Changelog
0.70
[*] Separate KMODE and COREDLL patches
[*] Use FPUEnabler.dll instead of op_fpu.dll
[*] Unpatch FPU calls on exit
[*] Fix leak on exit
[*] __eqs, __eqd, __ges, __ged, __gts, __gtd, __les, __led, __lts, __ltd, __utos, __utod, __stou, __dtou added

Remember this is proof-of-concept code, it may not actually be very useful, and may have adverse effects. It's "because we can" code. New tricks were learned!

FPUEnabler_0.70.zip

Edited October 4, 2009 by Chainfire
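The selection step in the patcher notes (benchmark each routine, then patch it or not) could be sketched as follows. This is a hedged illustration in Python rather than the DirtyBench code: the `select_for_patching` function, the `min_gain` threshold, and all timing numbers are made up for the example.

```python
from typing import Dict, List, Tuple

# (emulated_seconds, fpu_seconds) per helper routine; hypothetical numbers.
Timings = Dict[str, Tuple[float, float]]


def select_for_patching(timings: Timings, min_gain: float = 1.0) -> List[str]:
    """Return routines whose FPU version beats emulation by more than min_gain."""
    selected = []
    for name, (emulated, fpu) in timings.items():
        # Patch only when the measured gain clears the threshold.
        if fpu > 0 and emulated / fpu > min_gain:
            selected.append(name)
    return selected


example_timings: Timings = {
    "__muls": (4.0, 1.0),  # clearly faster on the FPU: patch
    "__divd": (9.0, 1.5),  # clearly faster on the FPU: patch
    "__eqs": (0.5, 0.6),   # simple routine where the patch is slower: skip
}

to_patch = select_for_patching(example_timings)
print(to_patch)
```

Measuring first and patching only the winners would be one way to end up with some functions patched and the very simple ones left alone, since the extra jump into patch code can cost more than a trivial routine saves.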
Guest atifarkas Posted October 1, 2009
I ran some tests with the CorePlayer benchmark and didn't observe any speed increase. Is there a real demonstration that shows why this is good? :)
Guest Chainfire Posted October 1, 2009 (edited)
Various benchmarks we have run show results a few % faster. Probably mostly unnoticeable to the naked eye. But that isn't the point - the point is that our devices can be faster with a very simple change to the kernel (from MS' viewpoint) - it's pretty much a 3-liner - and there isn't any good reason it hasn't been made. True FPU support would make a much bigger difference than a patch like this ever can.

However, also keep in mind that right now we only support a small number of instructions, and many are still missing. Before we can see real effects (even with the patch) we'll have to implement more functions. And even then, it depends on what the bottleneck is in an app. We originally started on this for the GL 1.x layer, because it makes very heavy use of floats. How many floats CorePlayer uses, who knows. Hence the: "Remember this is proof-of-concept code, it may not actually be very useful, and may have adverse effects. It's "because we can" code. New tricks were learned!"

Edited October 1, 2009 by Chainfire
Guest GinKage Posted October 1, 2009
Seems that other side effects include crashing of Cube and Volume rocker popup after suspend/resume.
Guest daskalos Posted October 1, 2009
Tried it with the Acer M900, and it really pushes floating point through the roof, leaving others behind :P ... Great work guys. Great to see great developments :D (hope the crashing issue on the M900 can be easily fixed :) )
Guest mechcool Posted October 1, 2009 (edited)
Million thanks to Chainfire, NuShrike, cmonex, no2chem. Your contributions are greatly appreciated. I might not be using this anytime soon, but I sure am happy to know that there are people spending their precious time on enhancements like this. Once again, thank you. Greatly appreciate all the hard work you guys have put in.... :-)
Edited October 1, 2009 by mechcool
Guest NuShrike Posted October 1, 2009
My tests over at XDA: http://forum.xda-developers.com/showpost.p...mp;postcount=85
Guest Chainfire Posted October 1, 2009
GinKage, I have been able to replicate the rocker crash only once.. can you elaborate on exactly what happens? I cannot replicate either the cube or the screen lock crash on my device.

Nice stats daskalos :) We're still looking into a fix for the M900-specific issue... it may actually solve these crash issues as well.
Guest Chainfire Posted October 1, 2009
Oh and... all problems right now are suspend/resume problems, correct?
Guest NuShrike Posted October 1, 2009 (edited)
Quote (atifarkas): I ran some tests with the CorePlayer benchmark and didn't observe any speed increase. Is there a real demonstration that shows why this is good? :)

Because most good programs have learned to avoid SLOW floating-point math, you probably won't see much difference in CorePlayer or other high-performance programs. However, 3D programs and the like can't avoid needing fractional precision or complex math (matrices, sqrt), and this is where the big boost will be. An example is the Qt4 Framework, whose advanced graphics paths are entirely floating-point based, so the Windows Mobile port will be much faster now.
Edited October 1, 2009 by NuShrike
Guest Chainfire Posted October 2, 2009
Alright! It certainly looks like we have located and fixed the suspend/resume problem apparent in some programs on the Omnia II as well as the M900 in general. Test version works, now we just need to fix it up for release (and have you guys test it). GinKage, hope to see you on IRC tomorrow so you can do some last tests as well. For now, nap time!
Guest NuShrike Posted October 2, 2009
Quote (Chainfire): Omnia II as well as the M900 in general.

This includes any device running the S3C6410, which covers pretty much all the recent Samsung-CPU based Acers such as the F900, X960, etc.
Guest Albertri Posted October 2, 2009
Really, hats off to all you guys doing all these enhancements and discovering hidden potential in our devices! Keep up the good work. If you guys need some testers just let us know, I will be one of those queuing up. :) One quick question: will this patch benefit 3D games? Right now I have Ferrari GT Evolution installed on my O2.
Guest Chainfire Posted October 2, 2009
v0.70 is out... crash problem should be fixed!
Guest daskalos Posted October 2, 2009
A fix already? :) How fast, I thought releasing the fix would take another day or two :D Testing and running it right now, and yup, crashing from suspend/resume is fixed :P I will just keep this running for a while to monitor things... I notice that upon suspend/resume, the app suspends and resumes too...

About closing this: does it still need a soft reset, or does it exit by just tapping "ok"?
Guest Chainfire Posted October 2, 2009
Well the fix could have been better, actually, but that is for the future. The app detects suspend/resume, because it has to run some code before the phone suspends and then after resume to fix the problem. If you exit the app, FPU will be disabled and "old/slow" calculations will be put back. But KMODE will still be enabled until you soft-reset.
Guest rumkokos Posted October 2, 2009
Oh wow :) I read the first post from Chainfire and I felt like I was reading something that someone translated from Greek to Chinese and then to English using the 1st version of Google Translate :P I just wanted to say respect guys, I don't have an Omnia2 yet but it's definitely a good candidate in the future.. and I follow the development just out of sheer respect for you guys! So awesome to see you guys master the coding to extreme levels! Really RESPECT!! Figured you guys could use some positive feedback for your great job :D
Guest NuShrike Posted October 2, 2009
On this topic, let's celebrate Acer's IDontCare stance on the Pentium FPU bug: http://www.thefreelibrary.com/Acer+America...ssue-a015978504
Guest Albertri Posted October 3, 2009
Quote (NuShrike): On this topic, let's celebrate Acer's IDontCare stance on the Pentium FPU bug: http://www.thefreelibrary.com/Acer+America...ssue-a015978504

I think Acer only makes consumer products, which means an FPU bug won't really have a huge impact (not life-, business-, or environment-threatening) :) hehe
Guest Yohng Posted October 3, 2009
Quote (rumkokos): Oh wow :) I read the first post from Chainfire and I felt like I was reading something that someone translated from Greek to Chinese and then to English using the 1st version of Google Translate :P

I'm really feeling the same! :D Is there a relation between this FPU Enabler and your research about OpenGL? Is there a way, from those two subjects, to fix 3D in SPB MS3.5? (Could you please try to speak to dummies?) :D
Guest daskalos Posted October 3, 2009
Quote (Yohng): I'm really feeling the same! :P Is there a relation between this FPU Enabler and your research about OpenGL? Is there a way, from those two subjects, to fix 3D in SPB MS3.5? (Could you please try to speak to dummies?) :D

In my opinion, these guys will not waste their time and effort developing something that will not benefit the devices in the end. The recent projects do have connections with each other, though I can't explain it from an expert's view. :) Try to read the OpenGL development thread thoroughly and you will realize that if it succeeds, it will not only fix SPB 3.5's 3D but also fix compatibility and performance issues with other 3D apps and games.

As I see it, they are trying their best to explain things in a way that is comprehensible to our non-developer minds, so it's up to us to make the effort to look up some terms (through Google or other means) that we may not understand. They are doing us all a great favor without asking anything in return... So it is we who have to adjust to things, not them :D
Guest rumkokos Posted October 3, 2009
Yeah, they are doing a great job indeed! However, what is the point of having 3D in SPB 3.5? I've seen it on the HTC TP2 and it's no use at all. If that's related to any other useful 3D app it's ok, but if it's only for the SPB 3D carousel view there's no point in developing that as it's useless.. eye candy only :D I hope they will manage to raise the number of triangles per second so the phone is faster; other than that, this phone is phenomenal. However, if they manage to get the development of ROMs done in a way that with minimal effort they could all be transformed to fit the Omnia Pro.. I'd be in serious doubt about which phone to buy again. It's a great phone and a lot of promising devs are looking into it right now, I guess, so I think there is a bright future for this phone :) Hope you guys are having fun developing :D :P
Guest Chainfire Posted October 3, 2009
For the last few replies, see my post (in a few minutes) in the GL thread. FPU can indeed improve GL speed, as it uses a lot of float calculations, which is exactly what the FPU helps with. How much difference in the real world? Don't know.
Guest blazingwolf Posted October 4, 2009
Seems to be working on an F900. SKTools reports the following:

With FPU enabled:
Integer: 342.3579
Floating Point: 39.475

With FPU disabled:
Integer: 341.8036
Floating Point: 7.444

Big difference but I don't see much in actual use right at the moment. I'm sure that will change though. :)
Guest Klusek Posted October 5, 2009
Samsung Omnia Pro B7610

Before:
Integer: 514,3381
Floating Point: 10,915

After:
Integer: 512,8265
Floating Point: 58,243
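To put the two benchmark reports above side by side, the floating-point gain is simply the score with the patch divided by the score without it. A small Python sketch follows; the helper name `speedup` is only for illustration, and the scores are copied from the F900 and Omnia Pro B7610 posts (with the decimal comma read as a decimal point).

```python
def speedup(with_fpu: float, without_fpu: float) -> float:
    """Ratio of the patched score to the unpatched score."""
    return with_fpu / without_fpu


# Floating-point scores as posted: (with FPU Enabler, without FPU Enabler)
reports = {
    "Acer F900": (39.475, 7.444),
    "Samsung Omnia Pro B7610": (58.243, 10.915),
}

for device, (patched, unpatched) in reports.items():
    print(f"{device}: {speedup(patched, unpatched):.1f}x floating point")
```

Both devices come out at roughly 5.3x on the floating-point score, while the integer scores in the same reports barely move.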