Offline version of Wikipedia?

Guest DJ_Enigma

I've seen this mentioned a couple of times on this forum but don't know how to get it.

I think Paul said it's something like a 90 MB download, but I'd very much like to get it.

Has anyone got a link?

Cheers,

DJ

Guest Swampie

Do you mean this:

http://news.bbc.co.uk/1/hi/technology/6566749.stm

"Nearly 2,000 Wikipedia articles will be sold on compact disc to give people without a net connection access to highlights of the popular web resource.

The Wikipedia Version 0.5 CD collection includes topics such as geography, arts, literature, science and history.

The articles were selected by software that rated their quality and importance to the Wikipedia community.

Martin Walker, a senior academic from the US also helped set the selection criteria for the $13.99 (£7) disc..."

Guest Menneisyys

You may want to read my related Wiki Bible ;) I copy it here:

Wikipedia has become by far the best source of really up-to-date information – in many respects, even better than the online Oxford English Dictionary (see “The Definitive Roundup of All Pocket PC Dictionaries Part II – non-WordNet-based English Dictionaries” for more info on the well-known alternative sources of information, if you're interested).

In this roundup, I elaborate on how you can access Wikipedia on your Pocket PC in ways other than directly browsing its pages in your Pocket PC Web browser.

Note that this article contains a LOT of never-before-published tricks and tips. Did you know, for example, that you can save a lot of money / greatly speed up your mobile phone-based online Wiki access by using data compression? Did you know that, for example, one of the offline Wiki databases, Lexipedia, offers excellent fuzzy searching capabilities, which make it possible to find keywords whose spelling you’re not really sure of? The list continues – trust me, even seasoned Wiki users will learn a lot of tricks and tips from this article.

Compressed / optimized online access to drive down communication costs/loading times

Unfortunately, Wiki (still?) doesn’t have a WAP or a PDA-optimized “official” interface that would return the content at least GZIP-compressed (please do read this article and the referenced articles to understand what this means) to heavily drive down the communication costs. Therefore, you need to rely on online proxies or repositories.

This article is probably the best to elaborate on these. It’s a bit outdated (for example, the Wiki proxy at 3g.co.nz doesn’t work any more) and lacks a lot of very important information.

The related comparison chart is here (sorry, couldn’t put it in here). Note that Pocket Internet Explorer / Internet Explorer Mobile in WM2003+ supports WAP (also see this article for more info) if you prefer accessing Wiki through WAP.

To summarize the chart's contents: you may want to try pda.[language code].wapedia.org, but it’s not much better than a generic Web compression service (elaborated on here).
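To get a feel for why GZIP compression matters on a metered mobile connection, here is a minimal Python sketch. It is only an illustration, not how any of the proxies above actually work: the sample page is made-up, highly repetitive markup (an assumption, so the saving is far better than a real article would give), and `compression_report` is a hypothetical helper name.

```python
import gzip


def compression_report(page_bytes):
    """Return (original size, gzip-compressed size, fraction of bytes saved)."""
    compressed = gzip.compress(page_bytes)
    if not page_bytes:
        # Nothing to save on an empty page; avoid dividing by zero.
        return 0, len(compressed), 0.0
    saved = 1 - len(compressed) / len(page_bytes)
    return len(page_bytes), len(compressed), saved


# Made-up sample markup (assumption): real articles are less repetitive,
# so expect a smaller saving in practice.
sample_page = (
    "<p>An encyclopedia article with <b>bold</b> and <i>italic</i> markup.</p>\n" * 200
).encode("utf-8")

original_size, compressed_size, saved = compression_report(sample_page)
print(f"original:   {original_size} bytes")
print(f"compressed: {compressed_size} bytes")
print(f"saved:      {saved:.0%}")
```

The fewer bytes that go over the air, the less you pay on a per-kilobyte data plan and the faster the page arrives, which is the whole point of going through a compressing proxy.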

Offline access

There are three Wikipedia databases / applications on the Pocket PC that make it possible to keep a local Wiki copy and browse it without any kind of Internet access.

MDict

The engine itself, MDict, is free and, being an excellent dictionary / database application on the Pocket PC (please see this roundup of Pocket PC English dictionaries for more info on MDict itself), highly recommended. As is also pointed out in the article, there are some free databases for MDict, which also add to its usability – in this screenshot, for example, it has the WordNet and the OPTED dictionary databases, among others.

However, the MDict Wiki database itself is very outdated (late 2003 – see, for example, the original PPC article here to see how outdated it is) and must be rebuilt in order to be usable. That is, while MDict is a free and decent engine, as its Wiki database is three years old, it’s not recommended.

TomeRaider 3 with Erik Zachte’s databases

This is by far the best engine and database – it supports HTML and has a considerably newer database than the other two alternatives.

Its main disadvantage compared to the most important alternative, Lexipedia, is that its index search completely lacks fuzzy search and in-word (substring) searching. Also, unlike the other two alternatives, it has no way of following external links (or even getting their contents).

However, in all other respects it’s far superior to Lexipedia. For example, it has a freely scrollable index, which is a definite advantage. It can also search inside the article text. While this may be a bit slow, it’s unique: the alternative apps don’t offer this capability.

Lexipedia 2.4 by Revolutionary Software Front

It has some real advantages over TomeRaider: it has a quickly searchable index, allows for fuzzy (error correction) searches (just like Lextionary 2.4 by the same developer) and has clickable external links.
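To make the difference between fuzzy (error-correcting) and in-word (substring) index searches concrete, here is a small Python sketch. Lexipedia's actual algorithm isn't documented here, so this is not its implementation: it uses the similarity matching from Python's standard `difflib` module as a stand-in, and the title list and the helper names `fuzzy_lookup` / `substring_lookup` are made up for illustration.

```python
import difflib


def fuzzy_lookup(query, titles, max_results=3, cutoff=0.6):
    """Return titles whose spelling is close to `query`, best match first."""
    # Compare in lower case so capitalisation doesn't count as a typo.
    by_lower = {title.lower(): title for title in titles}
    matches = difflib.get_close_matches(
        query.lower(), list(by_lower), n=max_results, cutoff=cutoff
    )
    return [by_lower[match] for match in matches]


def substring_lookup(query, titles):
    """Return titles that contain `query` anywhere (in-word search)."""
    needle = query.lower()
    return [title for title in titles if needle in title.lower()]


# Hypothetical index entries, for illustration only.
titles = ["Wikipedia", "Encyclopedia", "Pocket PC", "Dictionary", "Windows Mobile"]

print(fuzzy_lookup("Wikipeda", titles))    # misspelled query
print(substring_lookup("pedia", titles))   # fragment from the middle of a title
```

A plain prefix-based index, the kind you scroll through alphabetically, would find neither the misspelled "Wikipeda" nor titles that merely contain "pedia", which is why these two search modes are worth having.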

Lexipedia does, however, have major drawbacks. Like Lextionary, it doesn’t allow in-description search (unlike TomeRaider), and it renders the descriptions as plain text, without any HTML markup. This means not only the complete lack of even basic things like bold and italic markup (which would be essential to emphasize, for example, newly introduced terms), but also the complete lack of info boxes (which are present in a lot of Wiki articles – even in the “old”, one-year-old versions) and their contents.

Unfortunately, the complete lack of HTML support seems to be common with all Revolutionary Software Front products. Given that it’s not trivial to implement (unless the developer switches entirely to using the Pocket Internet Explorer plug-in to render the HTML content) I don’t think it’ll be implemented in the near future.

Also, the database reflects the Wiki state as of early August 2005 and cannot be manually updated (the database format is proprietary and closed, as opposed to that of TomeRaider, which has freely available tools for database creation). In this respect, TomeRaider, reflecting the late December 2005 database state, is definitely better. And, again, don’t forget that you can create a newer, updated version of the TomeRaider database at any time, unlike with Lexipedia, where you are at the developer’s mercy.

Please also see this full review of the engine (TomeRaider3) itself.

Comparison chart – offline solutions

The comparison chart is here. Note that it’s entirely different from the first comparison chart showing the online, Wiki-related compression / WAP/PDA-conversion services.

Verdict

Apart from the inability to (quickly) search for substrings in the titles, to make fuzzy searches (which are unique features of Lexipedia) and to click external links, go for the TomeRaider version. The openness of the format, the full HTML support, including info boxes, the ability to search for anything in-text, the fact that you get not only Wiki for the price of TomeRaider but also the ability to read any other TomeRaider documents are all big advantages of the TomeRaider version. (If you can and will pay more than twice the price of Lexipedia, that is.)

Lexipedia has definite advantages over TomeRaider (fuzzy search, clickable links). However, because it completely lacks HTML markup (including infoboxes), it is bound to hide relevant information. It is also based on a slightly older database, which will become an even bigger problem as time goes by and the TomeRaider version is updated (I also plan to recompile and update the TomeRaider version), if (and only if) the Lexipedia database is not updated as well.

Therefore, I recommend checking out the TomeRaider version first and Lexipedia only second (if at all).

Unfortunately, I cannot recommend the MDict Wiki version at all. It’s really outdated. Its database MUST be updated to become usable.

Links

In addition to the links to my other articles, I also recommend this FirstLoox thread on the technical limitations of other e-book formats, for example Mobipocket Reader (note that there are still Mobi conversions; for example, the pretty new (14.4.2006) German one here; thanks to Felix for the tip!), and this AximSite thread on the copyright issues of the TR database.

Note that since the article “The whole world in your pocket (almost)" by Shaun McGill (which you may also want to read) was written, the TR databases have been relocated from here to here.

UPDATE (08/29/2006): PPCT frontpage

UPDATE (01/30/2007): In October 2006, a brand new dump was posted. The text-only version, however, requires at least a 2 GByte card, unlike the previous dump. Also, the Lexipedia database has been updated; the new database is dated 12/29/2006.

UPDATE (02/04/2007): I've thoroughly examined the database dates of both new versions to find out which one is newer. The Lexipedia database reflects the Wiki state of 09/25/2006 (while the release date is, as has already been pointed out, late December 2006). It is, therefore, more than a year more up-to-date than the previous version, and it still fits on a 1 GByte card.

The technically superior TomeRaider version, on the other hand, is definitely a disappointment this time: it’s more than a year old (don’t pay attention to the file dates, which “only” go back to October 2006); the database itself reflects the state of 12/12/2005. In this respect, it is in no way better / newer than the old version. The TomeRaider folks should, at last, release a more up-to-date database to remain competitive. All in all, you’ll want to prefer the Lexipedia version if you want the latest Wiki state.

(In the meantime, there haven’t been new Wiki database versions for MDict. That is, it should be entirely forgotten as an alternative.)
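For anyone who wants to double-check the database ages quoted in the 02/04/2007 update, here is a quick Python sketch. It assumes the dates are written MM/DD/YYYY (as the other update stamps suggest) and measures age relative to the date of that update; `snapshot_age_days` is just a helper name made up for this example.

```python
from datetime import date


def snapshot_age_days(snapshot, as_of):
    """Number of days between a database snapshot and a reference date."""
    return (as_of - snapshot).days


# Reference date: the 02/04/2007 update (assumed MM/DD/YYYY).
as_of = date(2007, 2, 4)

# Snapshot dates as quoted in the update (assumed MM/DD/YYYY).
snapshots = {
    "Lexipedia": date(2006, 9, 25),
    "TomeRaider": date(2005, 12, 12),
}

for name, snapshot in snapshots.items():
    print(f"{name}: {snapshot_age_days(snapshot, as_of)} days old")

# The newest snapshot is the one with the latest date.
newest = max(snapshots, key=snapshots.get)
print(f"newest database: {newest}")
```

It only restates the arithmetic behind the verdict: the TomeRaider snapshot is well over a year old at the time of the update, while the Lexipedia one is a few months old.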

Cross-posted to: PPCT, MobilitySite, AximSite, XDA-Developers, BrightHand
