Some Rockbox updates

[Posted February 1, 2006 by corbet]

Last week's Rockbox review was reasonably well received. Since then, however, a couple of things have happened - one good, one less so - which make an update in order.

Starting with the good news: the iPod port can now produce audio on the iPod Nano and 4G Color/Photo models. That means that there is now a totally free (if still a bit bleeding edge) firmware offering for this otherwise proprietary, DRM-equipped player. iPods running Rockbox will have all of the features described last week, including a much wider variety of codecs. The iPod Rockbox hackers have put a lot of work into this port, and congratulations are in order.

Support for a full-color "while playing screen" has also been merged since last week - a development which should reduce the number of people complaining that the Rockbox interface is ugly.

The bad news relates to the voice menu support which makes Rockbox so appealing to blind users (and some others as well). The best set of voices provided for Rockbox, by many accounts, was generated with a copy of AT&T Natural Voices. Recently, the Rockbox developers got a friendly little cease and desist notice from the folks at Wizzard Software, the company which distributes that product in the U.S. By distributing the output from this program, says Wizzard, Rockbox was violating the end user agreement for the software.

So the AT&T voices were pulled from the web site while the EULA was examined; further research seems to bear out Wizzard's claim. The licensing for that software is set up to require extra royalties if any voice output is redistributed or used in a product. So that set of voices is likely to be gone forever, and the developers are looking for replacements.

Some efforts are afoot to generate a set of voice files the old-fashioned way - by recording an actual human and editing the result. Sort of like Tom Baker making voice files for British Telecom. That is a labor-intensive way of solving the problem, however, and keeping the voice files current in such a fast-moving project involves quite a bit more labor. So an automated means for generating high-quality voice files would be a welcome contribution to the project. Perhaps a Festival expert would like to help them out?

Some Rockbox updates

Posted Feb 2, 2006 7:42 UTC (Thu) by nix (subscriber, #2304) [Link]

The *default* Festival voices have already been tried, and weren't really comprehensible enough to use as the sole interface (which is after all the point). But if someone knows enough to make them sound better...

Ogg

Posted Feb 2, 2006 9:04 UTC (Thu) by ncm (guest, #165) [Link] (2 responses)

Postings elsewhere (thanks linuxstb) reveal that the Rockbox port has been demonstrated to play 260 kbit/s Vorbis encodings reliably on the Nano. This is twice the rate claimed on the iPodLinux wiki, which leads me to suspect that wiki entry is simply obsolete. In any case, good news for Nano owners.

Hardware like the Nano (and suchlike) would be a lot more appealing if, in place of soldered-in flash memory, it had a CompactFlash or similar slot. The only such devices I have been able to locate that claim to play Vorbis out of the box are the Lexar Media LDP-800 and the MPIO FL100, neither (to my knowledge) in current production. Sandisk, Kuro, and Frontier Labs have had slot gadgets without Vorbis support. Of equal interest would be a memoryless gadget that would play files from any USB key, but I haven't found any such. Rockbox drivers for SD/MMC/CF or USB hosting would be a necessity to support any of these.

The problem with the fragmented MP3 player business is that there are too few people with any particular gadget to motivate a port.

Ogg

Posted Feb 2, 2006 13:01 UTC (Thu) by gravious (guest, #7662) [Link]

Are you not forgetting the iRiver H3xx series? These are also not in current production but play oggs out of the box.

Ogg

Posted Feb 2, 2006 17:41 UTC (Thu) by ncm (guest, #165) [Link]

The iriver H3xx series have hard disk drives, and cost more than 6 times as much as your typical memoryless player.

I have since learned of Sandisk and Lexar products (MP3 Companion and MPC-231 JumpGear MP3, respectively) that accept a (matching) USB key. Unfortunately neither decodes Vorbis, and neither got very good reviews. There's also a jWIN JX-MP93 gadget that takes SD cards, but again no Vorbis. It almost seems as if anybody who obtains the rights to implement WMA DRM agrees not to support Ogg alongside.

A Rockbox port for any of these (typically US$40-60) gadgets would be most welcome.

Some Rockbox updates

Posted Feb 2, 2006 10:36 UTC (Thu) by copsewood (subscriber, #199) [Link] (3 responses)

I really don't think automatically generated voices can be as good as recordings of a human reader. It's really a question of how many words and phrases need to be recorded. If the size of the job makes it impractical for just one reader, perhaps different readers' voices could be used for control prompts, information messages, errors and warnings.

One design consideration that favors a robot voice is that much greater storage compression is possible if the voice is generated on the fly from text when needed.

Some Rockbox updates

Posted Feb 2, 2006 16:21 UTC (Thu) by bronson (subscriber, #4806) [Link] (2 responses)

Except that these little boxes just don't have the grunt to autogenerate voice. My MP3 player is about the CPU equivalent of an Apple ][ (which is discouraging and, at the same time, absolutely stupefyingly amazing...)

Well, it's an Apple ][ with a 60 GB hard drive and an 11 hour battery life... That's the equivalent of 174762 floppies (5461 lbs). And you'd have to swap disks probably 8 times to play a single song.
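The floppy figures above work out if one assumes 360 KB per disk, about 32 disks to the pound, and "GB" meaning 2**30 bytes. None of these is stated in the comment; they are back-derived assumptions. A quick sketch of the arithmetic under those assumptions:

```python
# Back-of-the-envelope check of the floppy numbers.
# Assumptions (not stated in the comment): 360 KB per floppy,
# 32 floppies per pound, and 1 GB = 2**30 bytes.
FLOPPY_BYTES = 360 * 1024
FLOPPIES_PER_LB = 32


def floppies_for(drive_gb):
    """Whole floppies needed to match a drive of drive_gb gigabytes."""
    return (drive_gb * 2**30) // FLOPPY_BYTES


def weight_lbs(floppies):
    """Weight of a stack of floppies in whole pounds."""
    return floppies // FLOPPIES_PER_LB


def song_mib(disk_swaps):
    """Size in MiB of a song that spans disk_swaps floppies."""
    return disk_swaps * FLOPPY_BYTES / 2**20


if __name__ == "__main__":
    n = floppies_for(60)
    print(n, weight_lbs(n), song_mib(8))
```

Under these assumptions, a song needing eight disk swaps would be a little under 3 MiB, which is plausible for a compressed track.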

Some Rockbox updates

Posted Feb 2, 2006 17:48 UTC (Thu) by ncm (guest, #165) [Link] (1 responses)

The CPU in your MP3 player is likely to be 20 to 150 times as powerful as the one in your old Apple ][. Given the screen resolutions lately available, you could probably run a full-speed Apple ][ emulator, itself running a video game and mapping color video operations to the LCD, without causing your music to skip.

Some Rockbox updates

Posted Feb 3, 2006 0:58 UTC (Fri) by bk (guest, #25617) [Link]

Exactly. A typical DAP is roughly as powerful as a high end PC circa 1995-ish. 120 MHz CPU (with frequency throttling), 32 MB RAM plus tons of storage. The devices based on the PortalPlayer chip (iPod, iRiver H10) are even dual core!

Some Rockbox updates

Posted Feb 2, 2006 11:48 UTC (Thu) by ekj (guest, #1524) [Link] (3 responses)

I don't quite understand how the number of required voicefiles can be all that large.

Assuming, of course, that it doesn't include a spoken version of all existing artist names, song and album titles and the like, which is essentially infinite.

The words used in the interface itself ("Play", "Pause", "Volume up", "Poweroff", "Next Track" etc) can't *possibly* be more than say a hundred standard phrases, can it?

Or am I missing something?

Some Rockbox updates

Posted Feb 2, 2006 12:18 UTC (Thu) by bk (guest, #25617) [Link] (2 responses)

Christi Scarborough has recorded the voice UI manually in the past; here's a recent post of hers to the rockbox-users mailing list:

Actually the killer is editing down the recording to separate voice files. I prefer to do a minimum of three takes of each phrase so I can pick the best one. This then has to be manually picked out from the recording and saved to a long filename. It's a highly repetitive and boring task. And then the things that need voicing change so often that they quickly become out of date.
I'd be willing to do an up to date recording with my now much better recording equipment if someone else were willing to take my raw sound file and turn it into a voice file. Even reading five hundred odd phrases into a mic is not an insignificant amount of effort.
Christi

I don't know how they would tackle the localization issue, either.
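The editing step described in the quoted post, cutting one long recording into per-phrase clips, could in principle be partly automated by splitting on silence. The following is only a minimal sketch of that idea, not anything the Rockbox project is known to use: the function name, the amplitude threshold, and the minimum gap length are all hypothetical, and the input is assumed to be a list of signed PCM sample values already read from the raw sound file.

```python
def split_on_silence(samples, threshold=500, min_gap=4000):
    """Return (start, end) index pairs for stretches of sound.

    A stretch ends once at least min_gap consecutive samples stay
    below threshold in absolute value. Both defaults are guesses
    and would need tuning for a real recording.
    """
    segments = []
    start = None      # index where the current stretch began
    last_loud = None  # index of the most recent loud sample
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and i - last_loud >= min_gap:
            # Quiet for long enough: close the current stretch.
            segments.append((start, last_loud + 1))
            start = None
    if start is not None:
        # The recording ended while a stretch was still open.
        segments.append((start, last_loud + 1))
    return segments
```

Each returned pair could then be written out as its own file. Choosing the best of three takes would still need a human ear.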

Some Rockbox updates

Posted Feb 2, 2006 16:06 UTC (Thu) by kleptog (subscriber, #1183) [Link]

Interesting. Years ago, a friend and I wrote a simple DOS program to do recordings. It had a prompt where you could type a string. When you pressed space it started recording, space again to stop. The string you gave was the prefix so you got name01.wav, name02.wav, etc...

At the time we were recording sound effects off a video so you'd type "explosion". Then whenever an explosion was about to start you hit space, afterwards you hit space. You could replay last and delete last IIRC.

In one evening we recorded dozens of sound effects. Maybe something like this would be useful? It wouldn't be terribly hard to write.
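The numbered-take naming scheme described above (name01.wav, name02.wav, ...) could be sketched roughly as follows. This is an illustration only, not the original DOS program: the function names are made up, the audio is assumed to arrive as raw 16-bit mono PCM bytes, and capturing from a microphone (which needs a third-party library) is left out.

```python
import os
import wave


def next_take_path(directory, prefix):
    """Return the first unused path of the form <prefix>01.wav, <prefix>02.wav, ..."""
    n = 1
    while True:
        path = os.path.join(directory, f"{prefix}{n:02d}.wav")
        if not os.path.exists(path):
            return path
        n += 1


def save_take(directory, prefix, pcm_bytes, rate=44100):
    """Write raw 16-bit mono PCM data as the next numbered take and return its path."""
    path = next_take_path(directory, prefix)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(pcm_bytes)
    return path
```

"Delete last" then amounts to removing the most recently returned path, after which the same number is handed out again for the next take.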

Some Rockbox updates

Posted Feb 4, 2006 21:38 UTC (Sat) by tzafrir (subscriber, #11501) [Link]

Other projects, such as Asterisk, have tackled localization.

The result (in the case of Asterisk) is not perfect, but comprehensible.

Some Rockbox updates

Posted Feb 2, 2006 11:54 UTC (Thu) by MortFurd (guest, #9389) [Link]

Has anyone checked into using the MBROLA synthesizer to generate the voices? The samples I've heard sound pretty good.