Advent Vega kernel source code now available!


Guest PaulOBrien

I also think there is no need to unsquash things... even the contents of /system/lib can be overridden by /system/vendor/lib ... So there is no real need to build a nosquash system... And if you REALLY need RW access, there is a way: use an overlay filesystem over the squashfs ... Google it, you will find it. But it should be used just for development, not as a daily feature, since, as Cass said, device ROMs have /system mounted RO by default.

Speaking of more interesting things, I have been looking at the Wifi errors that appear when the tablet goes to suspend. This seems to be a bug in the Atheros driver, which does not honor the suspend request and keeps posting commands to the wifi card even after telling Linux it is suspended... The sdhci stack has no provisions to deal with that and tries to honor the commands by sending them... But the sdhci controller IS suspended, so it can't, and it ends up giving the errors we see...

Will have to study the Atheros driver a bit...

Wonderful a sleepwalking wifi device, does not get any better than that :)

Guest ejtagle

Well, I ran out of time today, but it seems I have found the main problem with wifi

1) To make HostAP work, we will need to patch the hostapd daemon, as the method to force the ar6002 into AccessPoint mode differs radically from the one used by the bcm43xx ... I already did the required patch for VC, so it should be nearly trivial to port to ICS...

2) There are 2 problems with wifi... First, the suspend problem ... It seems the Atheros driver, to suspend the device, just powers it off. No problems there. But when the system resumes, the SDHCI bus driver tries to identify the Atheros card immediately, and that identification fails because the card is not given enough time to initialize properly. I made some kernel changes to the SDHCI platform data that should give it 2.5 seconds before trying to detect the card when resuming... I think that should solve the problem.

The other problem with wifi (not being able to scan networks) is just a race condition ... The AR6002 driver is loaded, and immediately something tries to bring up the wlan0 interface. But that is impossible, as the interface is not ready: the AR6002 is still loading its firmware and initializing... So the fix will probably be just to add a 2 second delay between loading the module and trying to bring the interface up. I still have to find the file in AOSP where this change has to be made..
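As a rough illustration of the race described above, here is a small Python sketch that polls for the interface instead of sleeping for a fixed 2 seconds. It is not the AOSP change itself: the helper name `wait_for_interface` and its parameters are made up for this example, and it assumes that the appearance of `wlan0` under `/sys/class/net` is a good enough "ready" signal, which may not hold for the AR6002 driver if the netdev shows up before the firmware has finished loading.

```python
import os
import time


def wait_for_interface(name, sysfs_root="/sys/class/net", timeout=5.0, poll=0.1):
    """Return True once <sysfs_root>/<name> exists, False if `timeout` seconds pass.

    Hypothetical helper for illustration only: it waits for the network
    device to show up instead of sleeping for a fixed amount of time.
    """
    path = os.path.join(sysfs_root, name)
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

Whatever brings the interface up would call something like `wait_for_interface("wlan0")` right after loading the module and only continue when it returns True. A fixed delay is simpler, but a poll with a timeout does not wait longer than needed on a fast boot.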

Attached is the SDHCI platform kernel patch... It compiles, but I did not have enough time to test it yet

Regards

Eduardo

rel-14r7-shuttle support-trial.rar

Compiles OK but does not seem to fix the crash issue. It may resolve a crash when resuming from sleep, but I'm not seeing that one right now; my test case is wandering around in Tapatalk ...

http://pastebin.com/6xwGJGWi

Same as the logs before ... I have compiled in the qtaguid stuff but, as suspected, this was not the root cause ...

It's curious though why we are getting the wfi messages in the panic when nothing is supposed to be sleeping .. I'm using the thing at this point ;)

Cheers

Cass

Cass, Eduardo,

I am not an expert on kernel development, but could this be related to some USB controller / sdcard conflict? I seem to be getting reboots when accessing the sdcard, for example when reading a PDF file in ezPDF, or linking/unlinking apps with link2sd. It seems to me that this has not much to do with wifi.

Sent from my HTC Desire Z with Tapatalk

Guest ejtagle

Since yesterday i know the crash problems are caused by:

[ 2337.805586] Unhandled fault: imprecise external abort (0x1c06) at 0x50ac1f28

This error is an unrecoverable fault detected by the ARM processor.. The causes are always external to the core. I found a reference to this issue in the nvidia kernel git ... The exception is said to be imprecise because the exact RAM location can't be determined, and that is because it is an external assertion. The problem could be caused by a cache incoherency (I would try enabling the l2x0 cache workarounds as a first measure)... or perhaps at some point the kernel is accessing an undefined address..
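For reference, the `0x1c06` in that log line can be picked apart by hand. The sketch below assumes the ARMv7-A short-descriptor DFSR layout (status in bits [3:0] plus bit 10, domain in bits [7:4], WnR in bit 11, ExT in bit 12), which is my reading of the architecture manual and not something taken from this thread; the function name `decode_fsr` is made up for the example.

```python
def decode_fsr(fsr):
    """Split a fault status value into fields (assumed ARMv7-A short-descriptor layout)."""
    fs_low = fsr & 0xF            # FS[3:0]
    fs_high = (fsr >> 10) & 0x1   # FS[4] lives in bit 10
    return {
        "fs": (fs_high << 4) | fs_low,      # 5-bit fault status code
        "domain": (fsr >> 4) & 0xF,         # domain field, bits [7:4]
        "write": bool((fsr >> 11) & 0x1),   # WnR: abort caused by a write
        "ext": bool((fsr >> 12) & 0x1),     # ExT: external abort qualifier
    }


# Only the one status code discussed here is named.
FAULT_NAMES = {0b10110: "imprecise (asynchronous) external abort"}

if __name__ == "__main__":
    fields = decode_fsr(0x1C06)
    print(fields, FAULT_NAMES.get(fields["fs"], "other"))
```

Under that assumed layout the status code comes out as 0b10110, which is the asynchronous external abort entry, consistent with the text the kernel prints. Because the abort is asynchronous, the address shown after "at" should not be trusted as the faulting location.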

The problem seems to occur at random... It would be interesting to know whether all shuttle variants are affected by this, or whether it is restricted to some specific models... (ViewSonic/Pov/Advent/Shuttle) ...

Eduardo,

Maybe this is one fix you are thinking about, but errata 751472 is set to yes in the default Tegra Android config, and the Adam devs have also set it to yes: errata 751472 info (it's not set in your latest config update)?

I'm sure that's set in the boardconfig in Android.. I'll set it in the kernel later to see.

Guest ejtagle

Sleep works fine with the .built-in flag

The built-in flag does the trick, but it is not optimal ... as it keeps the WiFi card powered ... I will try to get a proper fix for this... It seems to be just an initialization problem when the device resumes from sleep ... ;)

CONFIG_ARM_ERRATA_751472=y

Added and same issue ...

http://pastebin.com/ke2Mb4Nc

Guest ejtagle

Well, this will take time... Seems to be random ... I

  • [ 284.671913] bssid 00:3a:99:d2:32:51
  • [ 338.354053] Unhandled fault: imprecise external abort (0x1c06) at 0x000221a8

----

  • [ 1847.112177] bssid 00:1b:2f:d7:79:f2
  • [ 1906.276284] Unhandled fault: imprecise external abort (0x1c06) at 0x5365aa24

----

  • [ 304.038276] bssid 02:1a:11:f3:40:14
  • [ 319.620019] Unhandled fault: imprecise external abort (0x1c06) at 0xdfd73205

----

There is always an "idle" time between the last kernel event and the crash... As you know, imprecise exceptions have no exact place or location in RAM. This could be caused by an incoherency between the external cache and the main SDRAM. I have been thinking about several things that could cause it:

1) A hardware bug (but the errata is enabled)

2) A Tegra2 external peripheral ... It comes to mind that we are skipping the SPI initialization to save DMA channels ... Something we didn't do on previous kernels... maybe we could re-enable some of them

3) A power supply glitch that is corrupting SDRAM contents. Power glitches could happen when the cpufreq governor kicks in and tries to lower the CPU frequency. At the same time, the CPU voltage is lowered. There is always more than 10 seconds of idling... So this could be the cause

4) Something that is overclocked... I don't think this could be the problem...

5) Some bug in the clocking scheme used

I think this problem did not happen with the previous 2.6.39 kernel... Looking at the logs a 2nd time, it seems to be related to idling... Well, tons of things to check ;)
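The "idle time before the crash" observation can be checked mechanically. Below is a small Python sketch that takes kernel log lines, finds each imprecise external abort, and reports how many seconds passed since the previous logged event. The sample lines are the three excerpts quoted above; the function name `idle_gaps` and the regular expression are just for this illustration.

```python
import re

# Matches "[  338.354053] some message" and captures timestamp and message.
LOG_LINE = re.compile(r"\[\s*(\d+\.\d+)\]\s*(.*)")

SAMPLE = [
    "[ 284.671913] bssid 00:3a:99:d2:32:51",
    "[ 338.354053] Unhandled fault: imprecise external abort (0x1c06) at 0x000221a8",
    "[ 1847.112177] bssid 00:1b:2f:d7:79:f2",
    "[ 1906.276284] Unhandled fault: imprecise external abort (0x1c06) at 0x5365aa24",
    "[ 304.038276] bssid 02:1a:11:f3:40:14",
    "[ 319.620019] Unhandled fault: imprecise external abort (0x1c06) at 0xdfd73205",
]


def idle_gaps(lines):
    """Return the seconds between each abort and the log line just before it."""
    gaps = []
    previous = None
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines without a kernel timestamp
        stamp = float(match.group(1))
        message = match.group(2)
        if "imprecise external abort" in message and previous is not None:
            gaps.append(round(stamp - previous, 6))
        previous = stamp
    return gaps


if __name__ == "__main__":
    for gap in idle_gaps(SAMPLE):
        print(f"{gap:.3f} s of silence before the abort")
```

On these three excerpts every gap is longer than 10 seconds, which matches the idling theory. Running the same function over the full pastebin logs would show whether that holds for every crash or only for the ones quoted here.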

BTW, I am also still working on RIL and Wifi :o

OK, I changed the governor to ondemand and I'll remove cpu_idle if I can .. to see if those make a difference

Guest brucelee666

Thought I would mention this line, which was added to the new boot image and was not previously in alpha2 or in the Adam boot image, as you're now looking at power management. It was added from the new Ventana stuff:

"write /sys/module/cpuidle/parameters/lp2_in_idle 1" Maybe test removing this if other tests fail; or it's probably fine and OK to be left, maybe?

edit: Should have said it's in the init.harmony.rc file
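For anyone who wants to try this without rebuilding the boot image, the same parameter can be flipped at runtime from a root shell or a script. Here is a minimal Python sketch; the sysfs path is the one from the init.harmony.rc line above, while the helper names are made up, and it assumes the parameter reads back as `1`/`Y` when enabled and `0`/`N` when disabled, as kernel boolean module parameters usually do.

```python
from pathlib import Path

# Path taken from the init.harmony.rc line; writing to it needs root on the device.
LP2_IN_IDLE = Path("/sys/module/cpuidle/parameters/lp2_in_idle")


def lp2_in_idle_enabled(path=LP2_IN_IDLE):
    """Return True if the parameter currently reads as enabled."""
    return Path(path).read_text().strip() in ("1", "Y", "y")


def set_lp2_in_idle(enabled, path=LP2_IN_IDLE):
    """Write 1 or 0 to the parameter, mirroring the init.rc 'write' line."""
    Path(path).write_text("1\n" if enabled else "0\n")
```

Calling `set_lp2_in_idle(False)` is the runtime equivalent of changing the init.rc line to write 0. The setting does not survive a reboot, so it only suits a quick test before editing the boot image.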

Hmm nice ... never noticed that ...

write /sys/module/cpuidle/parameters/lp2_in_idle 1

worth a shot eh :)

I'm currently testing a kernel with no cpu idle

Yeah, interestingly I'm unable to force a reboot in my normal way using a kernel with

# CONFIG_CPU_IDLE is not set

I'll compile it back in and try the sysfs entry..

The Adam guys don't have that line either :)

edit2 -- brucelee666 .. you may just have found the solution mate .... I'm listening to Spotify and using Tapatalk at the same time, frantically rubbing my finger up and down the screen to cause a crash, and it's not doing it ... half a song on Spotify tipped it over the edge for me last night ... and Tapatalk .. pff, that's my test case as it crashes so often on demand ... not with this sysfs entry removed, so far .. normally I'd have killed it by now ;)

Testing resumes, but this could be it ... fingers crossed :)

Cheers

Cass

Guest ejtagle

Of course, it is important to try not to disable idling.. as that translates into more power consumed ... So it is important to be sure we disable what is needed, but no more than that ... This idling thing gives me the idea that we are in the presence of a power glitch ... I will also check the tps6586x driver, as there was something in it related to power stabilization time ;)

Yes, idling is enabled now but the lp2 enable in idle is now 0.. seems ok

Guest ejtagle

The main difference seems to be that idling can be done in lp3 or lp2... If lp2 is disabled, lp3 will be used .. I don't know the power consumption difference, or whether we will notice it or not :)

Just charging it now.. it's been off power for 17hrs, a lot of it in sleep.. tomorrow I'll probably have a better idea, if I have data for that ;)

Guest brucelee666

Cass,

Hopefully your further tests have not resulted in any reboots, and removing this line, or at least setting the value to zero/N, has fixed the problem.

I was not at a computer, but I added a line to postboot to set the value, and in 3-4 hours of use I never had a reboot, so it looked good.

Yep, nice one .. no reboots here ... just packaging an update for the users ... if anyone can find a reboot in there, they will ;)

EDIT - Now that the random reboots are seemingly resolved, for the time being anyway .. I've looked at the nvrm crashes we are seeing in games. The crash logs I posted a day or so ago indicated that we had a RAM exhaustion issue in effect .. I just upped the GPU RAM to 256 and my test game "Switch", which crashed and dumped me to the desktop every time I hit something, now does not crash. I just found out that when I hit something it goes to a cutscene to show me the crash :) ... 256 for the GPU is not optimal for us, as you will know, but it gives me a starting point to work back from ... not all doom and gloom, it appears ;) Pity we don't have a gig in there so we could just up it and go ... meh .. no fun in that, is there ;) ..

Could be we have to look more closely at the app tuning side now, if we really have to up the RAM to anywhere near that ... I guess /d will tell me what I need to look at ...

Cheers

Cass

Guest brucelee666

Cass,

Yes, noticed you have now released; will wait and see if users find it fixed. Last thing on this, for info: in "/d/cpuidle/lp2", with the parameter set to yes this file gets updated with times and percentages, but with the parameter set to no nothing gets updated.

Only mentioning it in case this has to be looked at again for battery life or something. Also, in board-shuttle, in "tegra_suspend_platform_data shuttle_suspend", cpu_off_timer is set to zero; looking at other boards this is usually 200?

Re. GPU mem: on honeyice I think they had it set to 192mb

Cool, cheers .. guess I have to find some HI users to see how they found interactive performance with that value; wooshy1 would know, I suspect, how he finds it. With 256, app load times are a bit slow... not unexpected ;)

Wrt the lp2 debug entry, I'd expect nothing to show if we unset the value. I'd like to see battery performance before we care about that again... I'm happy it's "fixed"

Eduardo can speak to why we use that cpu_off_timer value, as I've no idea ...

Guest brucelee666

Cass,

Just downloaded the free version of Switch, started it and played; first time no problems, no crash. Went to apps, launched Fruit Ninja, played a game, went back to Switch and played again, still no crash.

My GPU mem is the same as alpha3; "settings/apps/running" shows 184mb free RAM, 162mb used - possibly you have other things running?

Plume and Hackers KB .. the other 3 are settings, media and google services ... I don't run much ;)

Do you see the cutscene when you crash?

EDIT: 160 seems OK too .. I need more test cases before I'd say I'm happy with 160 :)

Interactive performance is waay better too .. I could live with 160, I think

Guest brucelee666

Been getting different things after playing a few times: the game plays fine, I get the crash scene, then the resume/main menu options; I play, then crash, then the screen starts flickering (game freeze); I press the home button, select the game from the previous apps menu, and the game is back at the main menu.

The NvrmChannelSubmt failed (error=196623, SyncPointValue=5688750 message happens when this occurs; I think I have seen that in someone's GTA log.
