 FPU Enabler, v0.70, by Chainfire & NuShrike, proof-of-concept
Chainfire
post Oct 1 2009, 10:55
Post #1


Enthusiast
Group Icon

Group: Posters
Posts: 190
Joined: 1st October 2007
Member No.: 306,145



Several new devices have an FPU these days (VFP in this case), a coprocessor that can speed up floating-point (fractional number) calculations. Devices that have an FPU include those based on the S3C6410 processor, like the Samsung Omnia II/Pro and Acer M900, and Snapdragon-based devices like the Toshiba TG01 and HTC Leo. However, Windows Mobile right now does not come with support for it. NuShrike and I decided to do something about this, and FPU Enabler is the result.

FPU Enabler is an application that patches coredll in memory and replaces some of the FPU emulation routines with actual FPU routines. All applications will automatically make use of this.
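
To illustrate the idea of swapping an emulation routine for a hardware one at run time, here is a minimal sketch in portable C. It is an analogy only: the real patcher rewrites code inside coredll in memory, while this sketch swaps entries in a function-pointer table. All names (`fp_table`, `soft_muls`, `hw_muls`, `patch_op`, `unpatch_op`, `call_op`) are hypothetical and not taken from FPU Enabler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation slots; the real tool patches coredll routines. */
enum fp_op { OP_MULS, OP_DIVS, OP_COUNT };

typedef float (*binop_fn)(float, float);

/* Stand-ins for the slow emulation routines. Assumption: they simply use
   the C operators, because this sketch only models the dispatch. */
float soft_muls(float a, float b) { return a * b; }
float soft_divs(float a, float b) { return a / b; }

/* Stand-ins for routines that would use real FPU instructions. */
float hw_muls(float a, float b) { return a * b; }
float hw_divs(float a, float b) { return a / b; }

static const binop_fn original[OP_COUNT] = { soft_muls, soft_divs };
static binop_fn fp_table[OP_COUNT]       = { soft_muls, soft_divs };

/* Redirect one slot to a replacement routine. */
void patch_op(enum fp_op op, binop_fn replacement)
{
    assert(op < OP_COUNT && replacement != NULL);
    fp_table[op] = replacement;
}

/* Restore the original routine for one slot. */
void unpatch_op(enum fp_op op)
{
    assert(op < OP_COUNT);
    fp_table[op] = original[op];
}

/* Callers always go through the table, so they pick up a patch
   without being rebuilt. */
float call_op(enum fp_op op, float a, float b)
{
    assert(op < OP_COUNT);
    return fp_table[op](a, b);
}
```

A caller would run `patch_op(OP_MULS, hw_muls)` once, keep using `call_op(OP_MULS, x, y)` as before, and run `unpatch_op(OP_MULS)` to get the original behaviour back.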

Now, obviously there are a number of caveats with an application like this. First, the FPU code is not IEEE compliant. This means that in some edge cases calculation results are undefined, which may cause issues. Exceptions are not supported - so for example a divide by 0 will not raise an error, which may be problematic. If your device acts as the control board for a nuclear power plant, we would definitely advise against using this app.
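
Since a divide by 0 will not raise an error, code that cares about such cases could validate inputs and results itself. The following is a minimal sketch of that idea in portable C; `checked_divs` is a hypothetical helper and not part of FPU Enabler or coredll.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: divide a by b and report failure through the
   return value instead of relying on an exception being raised. */
bool checked_divs(float a, float b, float *out)
{
    assert(out != NULL);

    /* Reject the divide-by-zero case up front. */
    if (b == 0.0f)
        return false;

    float q = a / b;

    /* Reject NaN or infinite results, for example from overflow. */
    if (!isfinite(q))
        return false;

    *out = q;   /* only written on success */
    return true;
}
```

On failure the output is left untouched, so the caller can keep a previous value or substitute a default.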

Not nearly all instructions that can be sped up by FPU use are supported (yet). They may be in the future. Actual real-world effects will depend heavily on the application used. For most applications, you'd have to look hard to notice it. This may change as (if) more instructions become supported. Some say Crayon Physics seems a bit snappier, though.

Because of the way this is patched in, you aren't actually reaching the full speed possible with a hardware FPU. Also, devices that are not already running in "ALLKMODE" are patched to run this way. As we haven't really found a good way to patch the context switching code yet, interrupts are disabled during FPU instructions and an extra jump into patch code is required. Disabling interrupts requires KMODE, and thus we patch everything to run this way.
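
The wrapper pattern described above can be sketched as follows. This is a simulation in portable C, not the actual patch code: the interrupt mask is a plain flag, and `irq_disable`, `irq_restore`, `hw_muls_unsafe` and `patched_muls` are hypothetical names. The comment about why the routine must not be interrupted is an assumption (presumably the unpatched context switch does not preserve FPU registers).

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated interrupt mask. On a real device this would be CPU state that
   only kernel-mode code may change; here it is a plain flag (assumption). */
static bool irq_masked = false;

/* Hypothetical stand-ins for privileged interrupt control. */
bool irq_disable(void)
{
    bool was = irq_masked;
    irq_masked = true;
    return was;
}

void irq_restore(bool was)
{
    irq_masked = was;
}

/* Stand-in for the FPU-based routine. Assumption: it must not be
   interrupted because a context switch would not save FPU registers. */
float hw_muls_unsafe(float a, float b)
{
    assert(irq_masked);   /* invariant: only runs with interrupts off */
    return a * b;
}

/* The patched entry point. The extra call plus mask and unmask is the
   overhead that keeps this below full hardware FPU speed. */
float patched_muls(float a, float b)
{
    bool was = irq_disable();
    float r = hw_muls_unsafe(a, b);
    irq_restore(was);     /* restore the previous state, do not force-enable */
    return r;
}
```

Restoring the previous mask state rather than unconditionally re-enabling keeps the wrapper safe to call from code that already runs with interrupts off.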

What we really need is for Microsoft, Samsung, Toshiba, HTC, etc. to simply enable FPU support in their kernel builds. That would solve pretty much all the issues, and be quite a bit faster even. We know CE6 supports it, and it is rumored WM7 will as well, but it would be great if they would put in support in new 6.5 builds :)

Instructions
  • Unpack the zip file somewhere
  • Copy the EXE and the DLL to \ on your device
  • Run the EXE
  • Click Patch button
  • If the Patch app says so, close it and restart it, click Patch again.
  • Wait until "Done!"
  • Keep the EXE running. Closing the EXE will "unpatch" the FPU instructions again.

Credits
  • Chainfire - Patcher code
  • NuShrike - FPU code
Thanks
  • cmonex - Help with patch theory
  • no2chem - Help with patch theory
Compatibility notes
  • Samsung S3C6410:
      • Samsung Omnia II - main test device
      • Samsung Omnia Pro - untested
      • Acer M900 - tested and works
      • Acer F900 - untested
      • Acer X960 - untested
  • Qualcomm Snapdragon:
      • Toshiba TG01 - quick test done by cmonex
      • HTC Leo - untested
Note that Snapdragon users will have a lot less benefit than S3C6410 users, but it does work.

Patcher notes
  • DirtyBench is not really (or "really not") a reliable benchmark and does not benchmark all functions - hence the name.
  • Instructions are benchmarked and then selected for patching or not. S3C6410 users will likely see about 17 functions patched, and 13 functions unpatched (those are very simple functions). Most effect will be seen in MUL and DIV instructions.
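
The benchmark-then-select step can be sketched roughly like this in portable C. It is not the patcher's code: `time_op` and `select_op` are hypothetical names, the two candidates are stand-ins that do the same thing, and the "strictly faster" rule is an assumption about how the selection might work.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

typedef float (*binop_fn)(float, float);

/* Two candidate implementations of the same operation (stand-ins). */
float soft_divs(float a, float b) { return a / b; }
float hw_divs(float a, float b)   { return a / b; }

/* Time `iterations` calls of f using the processor clock. */
clock_t time_op(binop_fn f, long iterations)
{
    assert(f != NULL && iterations > 0);

    volatile float sink = 0.0f;   /* keeps the calls from being optimized away */
    clock_t start = clock();
    for (long i = 1; i <= iterations; i++)
        sink = f((float)i, 3.0f);
    (void)sink;

    return clock() - start;
}

/* Pick the replacement only when it measured strictly faster; otherwise
   keep the original, as with the very simple functions left unpatched. */
binop_fn select_op(binop_fn original, binop_fn replacement, long iterations)
{
    clock_t t_orig = time_op(original, iterations);
    clock_t t_new  = time_op(replacement, iterations);
    return (t_new < t_orig) ? replacement : original;
}
```

A timing loop like this is noisy, so the choice can differ between runs, which fits the warning that the benchmark is not really reliable.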
Currently patched functions
  • __eqs
  • __ges
  • __gts
  • __les
  • __lts
  • __eqd
  • __ged
  • __gtd
  • __led
  • __ltd
  • __adds
  • __subs
  • __muls
  • __divs
  • __addd
  • __subd
  • __muld
  • __divd
  • __itos
  • __itod
  • __utos
  • __utod
  • __stoi
  • __stou
  • __stod
  • __dtoi
  • __dtou
  • __dtos
Other functions may follow in the future.
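
For readers unfamiliar with these names, they appear to follow an "operation plus precision" pattern: a trailing s for single precision (float), d for double, and i/u for signed/unsigned integers in the conversions. That reading is an interpretation of the list, not something spelled out here. The sketch below shows ordinary C operations with the helper that each one would presumably map to when floating point is emulated in software; the C function names are hypothetical.

```c
/* Each function performs one operation that a compiler targeting a CPU
   without FPU support would typically turn into a helper call. The helper
   named in each comment is an interpretation of the list above. */

float  add_single(float a, float b)    { return a + b; }      /* __adds */
double mul_double(double a, double b)  { return a * b; }      /* __muld */
int    less_single(float a, float b)   { return a < b; }      /* __lts  */
float  int_to_single(int i)            { return (float)i; }   /* __itos */
int    double_to_int(double d)         { return (int)d; }     /* __dtoi, truncates */
double single_to_double(float f)       { return (double)f; }  /* __stod */
```

Seen this way, any application doing plain float or double arithmetic ends up in these routines, which is why patching them affects all applications at once.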

Known issues
  • None at the moment
Feel free to report issues. I can't promise we will fix them, but they're interesting to know about anyway.

Changelog 0.70
  • Separate KMODE and COREDLL patches
  • Use FPUEnabler.dll instead of op_fpu.dll
  • Unpatch FPU calls on exit
  • Fix leak on exit
  • __eqs, __eqd, __ges, __ged, __gts, __gtd, __les, __led, __lts, __ltd, __utos, __utod, __stou, __dtou added
Remember this is proof-of-concept code, it may not actually be very useful, and may have adverse effects. It's "because we can" code. New tricks were learned!


This post has been edited by Chainfire: Oct 4 2009, 14:06
Attached File(s)
Attached File  FPUEnabler_0.70.zip ( 36.53K ) Number of downloads: 4534
 


--------------------
Author of many things ;)

My development blog: http://www.chainfire.eu/
atifarkas
post Oct 1 2009, 12:29
Post #2


Newbie
Group Icon

Group: Posters
Posts: 9
Joined: 17th September 2008
Member No.: 427,109

Device(s): omnia



I ran some tests with the CorePlayer benchmark and didn't observe any speed increase. Is there a real demonstration that shows why this is good? :)
Chainfire
post Oct 1 2009, 13:02
Post #3


Enthusiast
Group Icon

Group: Posters
Posts: 190
Joined: 1st October 2007
Member No.: 306,145



Various benchmarks we have run show results a few % faster, probably mostly unnoticeable to the naked eye. But that isn't the point. The point is that our devices can be faster with a very simple change to the kernel (from MS' viewpoint; it's pretty much a 3-liner), and there isn't any good reason it isn't there. True FPU support would make a much bigger difference than a patch like this ever can. However, also keep in mind that right now we only support a small number of instructions; many are still missing. Before we can see real effects (even with the patch) we'll have to implement more functions. And even then, it depends on what the bottleneck is in an app. We originally started on this for the GL 1.x layer, because it makes very heavy use of floats. How many floats CorePlayer uses, who knows.
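
The "depends on the bottleneck" point can be made concrete with the usual Amdahl's law estimate: if only part of an app's run time is floating-point work, speeding up that part has a limited overall effect. A minimal sketch, with `overall_speedup` as a hypothetical helper and all numbers purely illustrative:

```c
#include <assert.h>

/* Overall speedup when a share `float_share` (0..1) of run time is
   floating-point work that becomes `factor` times faster. */
double overall_speedup(double float_share, double factor)
{
    assert(float_share >= 0.0 && float_share <= 1.0);
    assert(factor > 0.0);
    return 1.0 / ((1.0 - float_share) + float_share / factor);
}
```

With made-up numbers: an app that spends 10% of its time in float math that becomes 5 times faster gets `overall_speedup(0.10, 5.0)`, about 1.09, so roughly 9% faster overall. An app that spends 60% of its time there would get about 1.92.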

Hence the:
QUOTE
Remember this is proof-of-concept code, it may not actually be very useful, and may have adverse effects. It's "because we can" code. New tricks were learned!


This post has been edited by Chainfire: Oct 1 2009, 13:03
GinKage
post Oct 1 2009, 15:57
Post #4


Enthusiast
Group Icon

Group: Posters
Posts: 225
Joined: 16th August 2009
Member No.: 576,397

Device(s): Samsung i8000 (Omnia II)



Seems that other side effects include crashing of Cube and Volume rocker popup after suspend/resume.
daskalos
post Oct 1 2009, 15:57
Post #5


Addict
Group Icon

Group: Posters
Posts: 947
Favorited Topics: 11
Joined: 8th October 2008
Member No.: 435,140

Device(s): Samsung Omnia i900
Twitter: @E1i077



Tried it with Acer M900

And it really pushes floating point to the roof, leaving others behind :D ...

Great work guys. Great to see great developments :lol:

(hope the crashing issue in m900 can be easily fixed :rolleyes: )


mechcool
post Oct 1 2009, 16:01
Post #6


Newbie
Group Icon

Group: Posters
Posts: 11
Joined: 8th September 2009
Member No.: 584,729

Device(s): Samsung Omnia 2



A million thanks to Chainfire, NuShrike, cmonex and no2chem. Your contributions are greatly appreciated. I might not be using this anytime soon, but I sure am happy to know that there are people spending their precious time on enhancements like this.

Once again, thank you. I greatly appreciate all the hard work you guys have put in... :-)


This post has been edited by mechcool: Oct 1 2009, 16:06
NuShrike
post Oct 1 2009, 16:25
Post #7


Enthusiast
Group Icon

Group: Posters
Posts: 287
Joined: 3rd August 2007
Member No.: 284,804

Device(s): Palm T|X, T-Mobile HD2





--------------------
KaiserSimFix: soft-reset-safe sim contacts hiding
CamerAware Buddy | HTCClassAction.org | KaiserGL SDK | LevelSight | FusionGPSFix
Support what I do and buy me a drink.
Chainfire
post Oct 1 2009, 16:28
Post #8


Enthusiast
Group Icon

Group: Posters
Posts: 190
Joined: 1st October 2007
Member No.: 306,145



GinKage, I have been able to replicate the rocker crash only once. Can you elaborate on exactly what happens? I cannot replicate the Cube or screen lock crashes on my device either.

Nice stats daskalos :) We're still looking into a fix for the M900-specific issue... it may actually solve these crash issues as well.
Chainfire
post Oct 1 2009, 17:13
Post #9


Enthusiast
Group Icon

Group: Posters
Posts: 190
Joined: 1st October 2007
Member No.: 306,145



Oh and... all problems right now are suspend/resume problems, correct?
NuShrike
post Oct 1 2009, 22:47
Post #10


Enthusiast
Group Icon

Group: Posters
Posts: 287
Joined: 3rd August 2007
Member No.: 284,804

Device(s): Palm T|X, T-Mobile HD2



QUOTE(atifarkas @ Oct 1 2009, 05:29) *
I ran some tests with the CorePlayer benchmark and didn't observe any speed increase. Is there a real demonstration that shows why this is good? :)
Because most good programs have learned to avoid SLOW floating-point math, you probably won't see much difference in CorePlayer or other high-performance programs.

However, programs such as 3D applications can't avoid needing fractional numbers or complex math (matrices, sqrt), and this is where the big boost will be. An example is the Qt4 Framework, whose advanced graphics paths are entirely floating-point based, so the Windows Mobile port will be much faster now.
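
The float avoidance mentioned above is commonly done with fixed-point arithmetic, where fractional values are stored in plain integers. A minimal sketch of a 16.16 format in C follows; the type and function names (`fx16`, `fx_mul`, `fx_div`, and so on) are made up for illustration and do not come from any particular program.

```c
#include <assert.h>
#include <stdint.h>

/* 16.16 fixed point: upper 16 bits integer part, lower 16 bits fraction. */
typedef int32_t fx16;

#define FX_ONE ((fx16)1 << 16)

fx16 fx_from_int(int i) { return (fx16)i * FX_ONE; }
int  fx_to_int(fx16 x)  { return (int)(x / FX_ONE); }  /* truncates toward zero */

/* Multiply in 64 bits so the intermediate product cannot overflow,
   then scale back down by one factor of FX_ONE. */
fx16 fx_mul(fx16 a, fx16 b)
{
    return (fx16)(((int64_t)a * (int64_t)b) / FX_ONE);
}

/* Divide: pre-scale the dividend so the quotient keeps 16 fraction bits. */
fx16 fx_div(fx16 a, fx16 b)
{
    assert(b != 0);
    return (fx16)(((int64_t)a * FX_ONE) / b);
}
```

Everything here is integer math, which is fast even without an FPU. The cost is limited range and precision, which is why workloads like 3D transforms tend to fall back on real floats.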


This post has been edited by NuShrike: Oct 1 2009, 23:02
Chainfire
post Oct 2 2009, 01:54
Post #11


Enthusiast
Group Icon

Group: Posters
Posts: 190
Joined: 1st October 2007
Member No.: 306,145



Alright! It certainly looks like we have located and fixed the suspend/resume problem apparent in some programs on the Omnia II as well as the M900 in general. Test version works, now we just need to fix it up for release (and have you guys test it). GinKage, hope to see you on IRC tomorrow so you can do some last tests as well.

For now, nap time!
NuShrike
post Oct 2 2009, 02:20
Post #12


Enthusiast
Group Icon

Group: Posters
Posts: 287
Joined: 3rd August 2007
Member No.: 284,804

Device(s): Palm T|X, T-Mobile HD2



QUOTE(Chainfire @ Oct 1 2009, 18:54) *
Omnia II as well as the M900 in general.
This includes any device running the S3C6410, which covers pretty much all the recent Samsung-CPU-based Acers such as the F900, X960, etc.
Albertri
post Oct 2 2009, 04:34
Post #13


Regular
Group Icon

Group: Posters
Posts: 119
Joined: 13th August 2009
From: Singapore
Member No.: 575,231

Device(s): GT-i8000, E63, ZN5



Hats off to all you guys doing these enhancements and discovering the hidden potential in our devices!

Keep up the good work. If you guys need some testers just let us know, I will be one of those queuing up. :lol:

One quick question: will this patch benefit 3D games? Right now I have Ferrari GT Evolution installed on my O2.
Chainfire
post Oct 2 2009, 11:43
Post #14


Enthusiast
Group Icon

Group: Posters
Posts: 190
Joined: 1st October 2007
Member No.: 306,145



v0.70 is out... crash problem should be fixed!
daskalos
post Oct 2 2009, 13:33
Post #15


Addict
Group Icon

Group: Posters
Posts: 947
Favorited Topics: 11
Joined: 8th October 2008
Member No.: 435,140

Device(s): Samsung Omnia i900
Twitter: @E1i077



A fix already? :o How fast, I thought releasing the fix would take another day or two :P

Testing and running it right now. Yup, crashing from suspend/resume is fixed :D I will just keep this running for a while to monitor things...

I notice that upon suspend/resume, the app suspends and resumes too... About closing this: does it still need a soft reset, or does it exit by just tapping "ok"?
Chainfire
post Oct 2 2009, 14:25
Post #16


Enthusiast
Group Icon

Group: Posters
Posts: 190
Joined: 1st October 2007
Member No.: 306,145



Well the fix could have been better, actually, but that is for the future. The app detects suspend/resume, because it has to run some code before the phone suspends and then after resume to fix the problem. If you exit the app, FPU will be disabled and "old/slow" calculations will be put back. But KMODE will still be enabled until you soft-reset.
rumkokos
post Oct 2 2009, 19:35
Post #17


Newbie
Group Icon

Group: Posters
Posts: 28
Joined: 21st February 2009
Member No.: 501,055

Device(s): Samsung Omnia



Oh wow :) I read the first post from Chainfire and it felt like reading something that someone translated from Greek to Chinese and then to English using the first version of Google Translate :D

I just wanted to say respect, guys. I don't have an Omnia II yet, but it's definitely a good candidate for the future, and I follow the development out of sheer respect for you guys! So awesome to see you master coding to such extreme levels!

Really RESPECT!!

Figured you guys could use some positive feedback for your great job ;)
NuShrike
post Oct 2 2009, 23:48
Post #18


Enthusiast
Group Icon

Group: Posters
Posts: 287
Joined: 3rd August 2007
Member No.: 284,804

Device(s): Palm T|X, T-Mobile HD2



On this topic, let's celebrate Acer's IDontCare stance on the Pentium FPU bug:
http://www.thefreelibrary.com/Acer+America...ssue-a015978504
Albertri
post Oct 3 2009, 02:04
Post #19


Regular
Group Icon

Group: Posters
Posts: 119
Joined: 13th August 2009
From: Singapore
Member No.: 575,231

Device(s): GT-i8000, E63, ZN5



QUOTE(NuShrike @ Oct 3 2009, 07:48) *
On this topic, let's celebrate Acer's IDontCare stance on the Pentium FPU bug:
http://www.thefreelibrary.com/Acer+America...ssue-a015978504


I think Acer only makes consumer products, which means an FPU bug won't really have a huge impact (nothing threatening to life, business or the environment) :P hehe
Yohng
post Oct 3 2009, 04:50
Post #20


Enthusiast
Group Icon

Group: Posters
Posts: 195
Joined: 26th October 2008
From: Guangzhou
Member No.: 441,819

Device(s): I8000NXXJ...



QUOTE(rumkokos @ Oct 3 2009, 03:35) *
Oh wow :) I read the first post from Chainfire and it felt like reading something that someone translated from Greek to Chinese and then to English using the first version of Google Translate :D

I really feel the same! :o

Is there a relation between this FPU Enabler and your research on OpenGL?
Is there a way to use those two to fix 3D in SPB MS3.5?

(Could you please try to explain it for dummies?) :D


--------------------
The best french forum : PDAphoneAddict.com