The Following 5 Users Say Thank You to malfunctioning For This Useful Post: | ||
|
2014-10-22, 11:01
|
Posts: 330 |
Thanked: 556 times |
Joined on Oct 2012
|
#2
|
The Following 3 Users Say Thank You to malfunctioning For This Useful Post: | ||
|
2014-10-22, 11:16
|
|
#3
|
The Following User Says Thank You to malfunctioning For This Useful Post: | ||
|
2014-10-22, 13:45
|
|
#4
|
|
2014-10-22, 17:18
|
|
Posts: 2,355 |
Thanked: 5,249 times |
Joined on Jan 2009
@ Barcelona
|
#5
|
The Following 4 Users Say Thank You to javispedro For This Useful Post: | ||
|
2014-10-22, 17:31
|
|
#6
|
The eMMC (NOT the 256MiB NAND chip) should actually perform its own wear-leveling and error correction.
If it's giving you bad blocks then I assume that either
a) the firmware is buggy (not impossible),
b) it has run out of spare blocks, which is a very bad thing. Assuming the firmware is not buggy, the blocks have been used uniformly, so if a large number of blocks are failing now, the vast majority will also fail "soon".
|
2014-10-22, 17:35
|
|
#7
|
|
2014-10-23, 15:17
|
|
#8
|
|
2014-10-23, 16:59
|
|
|
#9
|
I ran mkfs.ext3 with -cc from the Linux machine. Then, I performed a test by copying data to the N900 from the Linux machine, and one of the big files is still bad. I guess it's just a question of time before the flash memory on the N900 dies.
BTW, somebody elsewhere mentioned that to increase flash memory life:
- It's a good idea not to use ext3 but ext2.
- The volume should be mounted as noatime and async.
The second piece of advice makes sense, but I wasn't aware of ext3 causing more writes to flash.
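To check whether a volume is already mounted the way that advice suggests, something like the following could be run against the contents of /proc/mounts. It is only a sketch: the function names are made up for illustration, and the device names and mount points in the usage note are examples, not necessarily what an N900 shows. It relies on the fact that async is the default and is normally not listed, so the check looks for the absence of "sync" instead.

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text.

    Each line has the form: device mount_point fstype options dump pass.
    If the mount point appears more than once, the last entry wins.
    Returns None when the mount point is not listed.
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")
    return options


def follows_flash_advice(options):
    """True if the volume is mounted noatime and not forced to sync writes."""
    return "noatime" in options and "sync" not in options


if __name__ == "__main__":
    # Reads the live mount table on a Linux system.
    with open("/proc/mounts") as handle:
        table = handle.read()
    for point in ("/", "/home"):
        opts = mount_options(table, point)
        if opts is not None:
            print(point, opts, follows_flash_advice(opts))
```

For a line such as "/dev/mmcblk0p2 /home ext3 rw,noatime,errors=continue,data=writeback 0 0", mount_options(text, "/home") gives the four options and follows_flash_advice returns True.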
Not only that, but fsck reports no problems, and the failures are not visible to the user. What's more, I can mount this file (an Easy Debian image) and it apparently works, even though its contents are corrupt.
This is extremely troubling, as I only found out this file was corrupt by running md5sum on it. Nobody can be expected to md5sum every single file in their Linux filesystem.
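Checking every file by hand is not realistic, but it can be scripted. Below is a minimal sketch, assuming you record checksums while the files are known to be good (for example on the PC before copying) and compare them again later on the copy. The function names are invented for this example, and MD5 is used only because md5sum is what the thread already uses.

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 of a file, reading in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    manifest = {}
    for directory, _subdirs, files in os.walk(root):
        for name in files:
            full_path = os.path.join(directory, name)
            manifest[os.path.relpath(full_path, root)] = file_md5(full_path)
    return manifest


def verify_manifest(root, manifest):
    """Return the sorted relative paths that are missing, unreadable, or changed."""
    damaged = []
    for relative_path, expected in manifest.items():
        try:
            actual = file_md5(os.path.join(root, relative_path))
        except OSError:
            actual = None  # missing file or read error counts as damaged
        if actual != expected:
            damaged.append(relative_path)
    return sorted(damaged)
```

Build the manifest from the original directory, then call verify_manifest with the directory on the device and that manifest. Any path it returns is a file whose contents no longer match. Note that a freshly copied file may be served from the page cache, so it is safer to verify after remounting or rebooting.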
The Following 7 Users Say Thank You to javispedro For This Useful Post: | ||
|
2014-10-23, 22:21
|
|
#10
|
Yes, especially if you're getting _new_ bad blocks every so often.
Things you should check:
- Stop testing from your PC using USB, because it may be a bad USB cable.
- When you ran badblocks on the N900, did you get new ones every so often? Did the bad blocks "move"? Did they look random? If so, the problem may be caused by a bad contact around the eMMC chip.
Please note that every time you run badblocks you're causing additional wear on the eMMC.
- Have you tried reflashing _both_ the N900 and the eMMC?
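To answer the "did they move?" question without eyeballing lists, the results of two badblocks runs can be compared. This is a sketch that assumes each run's list was saved to a text file with one block number per line (which is what badblocks -o writes); the function names are invented for this example.

```python
def parse_block_list(text):
    """Parse one block number per line, ignoring blank lines."""
    return {int(line) for line in text.splitlines() if line.strip()}


def compare_runs(previous_text, current_text):
    """Classify bad blocks from two runs as persistent, new, or vanished."""
    previous = parse_block_list(previous_text)
    current = parse_block_list(current_text)
    return {
        "persistent": sorted(previous & current),
        "new": sorted(current - previous),
        "vanished": sorted(previous - current),
    }


if __name__ == "__main__":
    # Inline sample data; in practice read the two saved list files instead.
    first_run = "100\n200\n300\n"
    second_run = "200\n300\n4096\n"
    for kind, blocks in compare_runs(first_run, second_run).items():
        print(kind, blocks)
```

Blocks that show up in one run and vanish in the next point towards something intermittent (cable, contact), while a persistent list that keeps growing points towards the chip itself wearing out. Both runs must use the same block size for the numbers to be comparable.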
Unless it turns out to be a software-side problem, if the number of bad blocks keeps increasing your N900 is as good as dead. Maybe the Neo900 guys can make some use of it for spares... or try to find someone willing to replace the eMMC chip.
ext3 causes more writes because of the journal, although in the default configuration ('writeback' iirc) the actual difference is rather small.
This is not new at all. Fsck only detects inconsistencies in metadata. It does not detect metadata corruption, much less corruption in the actual data. This is the same in most current desktop filesystems -- ext*, FAT, NTFS (Windows), HFS+ (OS X), etc.
If you really need a filesystem that guarantees data integrity, you need to look at cluster filesystems, or at least something more advanced such as ZFS or btrfs, which checksum data as well as metadata.
The Following 2 Users Say Thank You to malfunctioning For This Useful Post: | ||
Basically, I copied a large (4 GB) file, and whenever I tried to read it (or to md5sum it) I would get a read error (md5sum just fails silently in this case by default).
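To see where in such a file the reads actually fail, rather than just learning that the checksum did not complete, the file can be read chunk by chunk and the failing offsets recorded. A minimal sketch, with invented function names, assuming the kernel reports the failure as an ordinary I/O error (OSError in Python):

```python
import os


def scan_chunks(read_at, size, chunk_size):
    """Call read_at(offset, length) for every chunk; collect offsets that raise OSError."""
    bad_offsets = []
    for offset in range(0, size, chunk_size):
        try:
            read_at(offset, min(chunk_size, size - offset))
        except OSError:
            bad_offsets.append(offset)  # keep going past the failed chunk
    return bad_offsets


def find_unreadable_chunks(path, chunk_size=1024 * 1024):
    """Return the byte offsets of chunks in the file that cannot be read."""
    size = os.path.getsize(path)
    # buffering=0 avoids Python-level read-ahead, so each chunk is read on its own.
    with open(path, "rb", buffering=0) as handle:
        def read_at(offset, length):
            handle.seek(offset)
            return handle.read(length)
        return scan_chunks(read_at, size, chunk_size)
```

An empty list means every chunk could be read, which says nothing about whether the data is correct; that still needs a checksum comparison. A non-empty list gives the byte offsets of the unreadable chunks.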
This is how I think I corrected the issue:
1. Reboot into Backup Menu.
2. Mount all partitions in storage mode (read/write).
3. Connect via USB to Linux laptop.
4. Run this command on the bad partition (which happened to be /dev/sdb5):
sudo mkfs.ext3 -c -b 1024 /dev/sdb5
This detected a number of errors (not too many, just a few), and wrote that information into the filesystem to prevent the OS from using those sectors.
Anybody else have any experience with this? If it works, at least it's reassuring to know that the problem can be remedied.