Posts: 156 | Thanked: 44 times | Joined on Dec 2007
#1
There are already several Wikipedia readers for the N900. However, many of them lack full articles or the full database, or have other significant shortcomings.

I'm currently working on something based around this project: http://users.softlab.ece.ntua.gr/~tt...iaOffline.html.

There is a disadvantage with this method though. The articles.xml file that this uses is ~6GB, and the index that it builds is another 3GB. However, storage is no longer that scarce: the N900 has 32GB of storage, and my 16GB card cost about $50 AUD.

An advantage, though, will be that the end user is more easily able to create their own updated dumps.

I plan on replacing the PHP MediaWiki parser in that project with a faster Python one, as well as doing some Maemo-specific tweaking.

It might seem overboard to have all of Wikipedia in your pocket - but so did the idea of having a Linux machine in your pocket when the 770 first came out.

Any comments/suggestions/questions?
 

Posts: 577 | Thanked: 699 times | Joined on Feb 2010 @ Malta
#2
I don't know... 6GB is quite a lot. Between movies, PSX games, MP3s and maybe Sygic maps, 32GB quickly shrinks to 5GB free.

And how will this be updated if the page is updated? Will it look through all the pages before updating? The N900 is an internet device. Offline viewing does kinda destroy the point.

I would suggest that you create an app that is able to download the articles one wants for offline viewing. This way there won't be a lot of space taken up.
 

Posts: 73 | Thanked: 18 times | Joined on Feb 2010
#3
Offline Wikipedia is indeed a killer application, even on internet devices like the N900. What do you do when you are abroad without a data plan?

A couple of questions: I guess images are not really part of the installation?
What would be the real advantage compared to evopedia: the LaTeX math presentation?
Or being able to produce your own dumps more quickly?

tredlie
 
Posts: 44 | Thanked: 38 times | Joined on Mar 2010 @ Germany
#4
Originally Posted by pantera1989:
And how will this be updated if the page is updated? Will it look through all the pages before updating? The N900 is an internet device. Offline viewing does kinda destroy the point.
Hm - there are quite a few use cases where an offline Wikipedia might be handy ... areas with no 3G coverage, for instance ... or going abroad (data roaming is still expensive these days).

I used to look up terms quite often in my Mobipocket Wikipedia - but then again, that was on a Palm Tungsten with no Internet connection at all.
 
Posts: 1,341 | Thanked: 708 times | Joined on Feb 2010
#5
Is the Wikipedia database kept compressed (bzip2 -9) when it is stored locally for offline use?
 
Posts: 156 | Thanked: 44 times | Joined on Dec 2007
#6
Originally Posted by tredlie:
A couple of questions: I guess images are not really part of the installation?
What would be the real advantage compared to evopedia: the LaTeX math presentation?
Or being able to produce your own dumps more quickly?
Images are not part of the installation a) because that would just be insanely large, and b) because Wikipedia doesn't provide dumps of them. The text dumps come from http://dumps.wikimedia.org/enwiki/20100312/.

LaTeX math presentation will be present as it is in the current desktop version. I am currently working to replace the PHP-based parser with the Python-based mwlib. Then I just have to rewrite the Perl bzip2 file extractor in Python. After that, it's just a matter of replacing the Django web service with a simple HTTP server, and then it shouldn't require anything special, apart from a few compiled bits.
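For illustration, here is a rough sketch of what the extractor step could look like in Python, using only the standard `bz2` module. It assumes the dump has been split into independently compressed bzip2 chunks and that an index maps each title to its chunk. The chunk layout, the record separators and the names `build_archive` and `read_article` are made up for this example and are not taken from the project linked above.

```python
import bz2
import io


def build_archive(articles, per_chunk=2):
    """Pack (title, text) pairs into independently compressed bzip2 chunks.

    Returns the archive bytes plus an index mapping each title to the
    (offset, length) of the chunk that holds it. Hypothetical layout.
    """
    archive = io.BytesIO()
    index = {}
    for start in range(0, len(articles), per_chunk):
        group = articles[start:start + per_chunk]
        # One record per article: title, NUL, text, then a \x01 terminator.
        raw = "".join(f"{title}\x00{text}\x01" for title, text in group)
        blob = bz2.compress(raw.encode("utf-8"), compresslevel=9)
        offset = archive.tell()
        archive.write(blob)
        for title, _ in group:
            index[title] = (offset, len(blob))
    return archive.getvalue(), index


def read_article(archive, index, title):
    """Decompress only the chunk containing `title` and return its text."""
    offset, length = index[title]
    raw = bz2.decompress(archive[offset:offset + length]).decode("utf-8")
    for record in raw.split("\x01"):
        if not record:
            continue
        name, _, text = record.partition("\x00")
        if name == title:
            return text
    raise KeyError(title)


articles = [("A", "alpha"), ("B", "beta"), ("C", "gamma")]
archive, index = build_archive(articles)
print(read_article(archive, index, "C"))
```

The point of the chunking is that a lookup only decompresses one small block rather than the whole multi-gigabyte file, so the data can stay bzip2-compressed on the card.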
 
Posts: 156 | Thanked: 44 times | Joined on Dec 2007
#7
There is also this: https://launchpad.net/wikipediadumpreader/ which I will have a look at. I haven't seen a Maemo package for it yet, but if it works decently well, I may port that instead, considering how Maemo is moving to Qt.
 
