Life & much, much more

I’m sick and tired of this ebook nonsense.

No. I like ebooks. At least the concept. I would love to be able to read books in beautiful, standardised print similar to that produced by LaTeX, on any device, on any screensize, without any problems like math reflow, images, and usage of ridiculous fonts. Oh, and DRM too. But that’s hell in itself.

But no. Ebooks are a mess. A big, honkin’ ridiculous pile of crap. A prime example of what not to do when converting a traditional medium to an electronic form. Why? Because of a lack of standardisation. There is no single format, due to (in a nutshell) firms not being able to talk nicely to one another, swallow their egos and agree. So now we’re stuck with 27 major formats (yes, count them) and each with its own little annoyance. Oh, and that’s without considering potential DRM being slapped on each one of them.

It’s not just the electronic format that is a mess – it’s the physical formatting too. Ebooks can be likened to the pre-CSS days of HTML, filled with non-semantic markup and tables stuck around everywhere. Anybody who has experienced the LaTeX nirvana that is "this is a title, not a bold, size 26, centered font" can relate to this – whilst creative freedom is good, computers unfortunately suck at this and are unable to tell what is title and what is paragraph. Thus I am stuck with some ebooks doing nonsense like linebreaking at 80 characters, not telling me when paragraphs start and end, and oh yes, every single plaintext ebook having its own flavour of markdown.

The terror doesn’t stop there. It continues by plaguing the now-necessary routine of converting from format to format whenever you want to transfer a book from one device to another. Every time you convert, it is inevitable that more non-semantic formatting is lost. This, of course, only happens if you can even convert it in the first place, thanks to our lovely friend DRM.

So what is the solution? The solution is threefold:

1) Force (taunts and physical violence may and shall be used) all publishers to agree to use a single, open format, such as EPUB, and make that format use TeX markup. Thus ebooks will be distributed in plaintext with attached and compressed images.

2) Force (see previous) all publishers to agree to use a single repository to prevent duplication of effort (another of my pet peeves, thank you for noticing) and spend time manually and painstakingly converting existing ebooks to this new format and dumping them in the repo.

3) Fix all the kinks to allow this TeX-structured ebook source to be rendered or converted to any other format (e.g. LaTeX-generated PDFs cannot reflow) should the retailer or consumer want, even if it means the retailer wants to affix some sort of DRM at this stage.

If you noticed, this follows a very much source (TeX-structured format) and binary (whatever you render the TeX into) way of distributing ebooks. This is a win-win situation. Anybody can buy from anywhere without fearing incompatibility. Retailers can still satisfy their craving for DRM. Ebooks are semantically marked and rendered beautifully. Even the plaintext looks beautiful.

It turns out I’m not the first to come up with a similar proposal. A firm known as River-Valley has been cashing in on this opportunity by reformatting ebooks for their rather technical clients, and has made significant progress towards this goal. Unfortunately, this project has apparently been stalled for quite some time. A few hopefuls at the MobileRead Forums have tried to make a start, but again I think it just died from lack of love.

But recently I had a wondrous epiphany to solve my woes once and for all. It was the sheer audacity to go against one of my joys in life – standards and conventions. The idea can be summed up in the two froody words "why bother?" Life is too short to care if your music collection is made up of oggs and not flacs or mp3s. Life is too short to bother to ensure that your metatags are using the ampersand correctly in place of "and". Life is too short to fix everybody else’s stupid mistakes that don’t fit your mental specification. So if you see somebody walking down the street reading a book where every sentence stops sharp at 80 characters, give them a pat on the back and congratulate them on finally getting their priorities straight.

Somebody please fix Nepomuk to make it do something useful like automagically sort my collections for me.

End rant.


A little introduction to MP3s

Hello there readers. Today I present to you yet another guest post by NathanKP from Inkweaver Review – please take some time to check out his website.

What is an MP3 and how does it work?

An MP3 is a file specifically designed for storing music. The term MP3 stands for MPEG-1 Audio Layer 3, the compression algorithm that is the basis of the format. This algorithm is what encodes music and makes it possible to store it in an MP3 file. Real music is made of smooth analog wave forms that come directly from an instrument. When music is stored on a CD, however, it must be in a digital format, that is, ones and zeros. Digital formats cannot represent smooth wave forms exactly, so the wave forms must be approximated by sampling. A CD samples the pure analog music 44,100 times a second and uses those samples to create a wave that is not purely smooth, but rather like stair steps. However, the human ear can’t really hear the difference without careful listening and a trained ear. Sampling is a type of compression, because analog music, on an LP for example, holds an infinite amount of data in each finite time period. CD sampling reduces this “infinite” file size to a mere 10 MB a minute. However, that is still much too large for ordinary purposes.
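The 10 MB figure can be checked with a little arithmetic. The sketch below assumes the usual CD audio parameters, which the text above does not spell out: 16 bits (2 bytes) per sample and two stereo channels.

```python
# Rough size of one minute of uncompressed CD audio.
# Assumptions (not stated in the text): 16-bit samples, 2 channels (stereo).
SAMPLE_RATE_HZ = 44_100
BYTES_PER_SAMPLE = 2
CHANNELS = 2
SECONDS = 60

def cd_bytes_per_minute():
    """Samples per second x bytes per sample x channels x seconds."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * SECONDS

size = cd_bytes_per_minute()
print(size, "bytes, or about", round(size / 1_000_000, 1), "MB per minute")
```

Under those assumptions the result lands a little above 10 MB, which matches the rough figure quoted above.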

MP3 is the next level of compression, able to store music data at approximately 1 MB per minute. It does this by simplifying the music, purposely losing some of the sounds. For one thing, most humans can only hear a specific range of frequencies, from about 20 Hz to 20 kHz. Some animals can hear sounds higher or lower than this, but humans in general can’t. By cutting out sounds outside of this narrow range, MP3 can greatly reduce file size.
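The "1 MB per minute" figure depends on the bitrate chosen when the file is encoded. As a rough check, the sketch below assumes a constant 128 kbit/s, a common MP3 bitrate that the text does not name, and compares it with 16-bit stereo CD audio (also an assumption).

```python
# Rough size of one minute of MP3 audio at a constant bitrate.
# Assumption: 128 kbit/s (the text does not say which bitrate it has in mind).
def mp3_bytes_per_minute(bitrate_kbps=128):
    bits_per_second = bitrate_kbps * 1000
    return bits_per_second * 60 // 8  # 60 seconds, 8 bits per byte

# 44,100 samples/s x 2 bytes x 2 channels x 60 s (16-bit stereo, assumed)
CD_BYTES_PER_MINUTE = 44_100 * 2 * 2 * 60

mp3_size = mp3_bytes_per_minute()
ratio = CD_BYTES_PER_MINUTE / mp3_size
print(mp3_size, "bytes per minute")
print("roughly", round(ratio), "times smaller than CD audio")
```

A higher bitrate gives a larger file and a smaller ratio, so the exact saving varies from file to file.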

Secondly, MP3 stores the sound with less precision, so that the wave approximations in the music have even coarser “steps.” This simplifies the wave forms by removing small variations. Then the music is encoded by using mathematical formulas to pull out data about the basic shape of the wave forms that make up the music.
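The effect of coarser “steps” can be sketched by rounding samples to fewer levels. This is only an illustration of how small variations disappear, not the actual MP3 procedure; the function name `quantize`, the test wave, and the level counts are all made up for the example.

```python
import math

def quantize(samples, levels):
    """Round each sample in [-1, 1] to one of `levels` evenly spaced values."""
    step = 2 / (levels - 1)
    return [round(s / step) * step for s in samples]

# One cycle of a sine wave with a small ripple on top of it (hypothetical data).
wave = [0.9 * math.sin(2 * math.pi * i / 100)
        + 0.01 * math.sin(2 * math.pi * i / 5)
        for i in range(100)]

fine = quantize(wave, 65_536)  # many small steps, like 16-bit audio
coarse = quantize(wave, 16)    # a few large steps

print(len(set(fine)), "distinct values with fine steps")
print(len(set(coarse)), "distinct values with coarse steps")
```

With only 16 levels the small ripple is mostly rounded away, while the overall shape of the sine wave survives.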

Every kind of wave form can be approximated by a mathematical formula. Calculus and other math techniques can be used to fit formulas to wave shapes. These formulas have specific parameters that require much less storage space than a complex sample of music. The software that handles this part of MP3 compression is called a codec (short for coder-decoder). The codec uses statistical information about the shape of the wave forms to recreate them. It is sort of like graphing a complex calculus problem. The problem might have only a few factors in it, but the shape it creates can be quite complex. In this way MP3 is able to store the complex wave forms of music very efficiently.
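To make the idea of a few numbers describing a complex shape concrete, here is a minimal sketch. It is not how an MP3 codec works internally; it only shows that a handful of parameters (here, made-up frequency and amplitude pairs) can regenerate a whole second of samples.

```python
import math

SAMPLE_RATE_HZ = 44_100

# Hypothetical "formula" for a sound: three (frequency in Hz, amplitude) pairs.
components = [(440.0, 0.6), (880.0, 0.3), (1320.0, 0.1)]

def synthesize(components, seconds=1.0):
    """Rebuild the wave by adding up one sine wave per component."""
    count = int(SAMPLE_RATE_HZ * seconds)
    return [
        sum(amp * math.sin(2 * math.pi * freq * n / SAMPLE_RATE_HZ)
            for freq, amp in components)
        for n in range(count)
    ]

samples = synthesize(components)
stored_numbers = 2 * len(components)  # one frequency and one amplitude each
print(stored_numbers, "stored numbers regenerate", len(samples), "samples")
```

Real music is far less regular than three pure tones, which is why an actual codec needs many more numbers, but the trade is the same: store a description of the wave rather than every sample.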

Of course, this is a very lossy technique. Not only is the frequency range limited and the detail reduced, but the sounds themselves are merely a mathematical approximation. However, most people can’t really hear the difference between MP3 music and CD music, or even the pure analog music of an LP.

MP3 Software

There is a plethora of different MP3 players on the market. As far as free software for computers goes, the very best are VLC media player, a very light player that is easy on computer resources, and Winamp, another free MP3 player that has been around for a long time.

Note from Dion Moult: I would also like to recommend “mplayer”.