A scalable and adaptable standardised user file structure?

I’ve got two more mock exams to cover (both on Monday), some progress has been made on University applications, more and more of the ThoughtScore project is inching its way into my WIPUP profile (starting to get to the interesting bit), and I’ve opened Blender again to do a quick animation favour which can be seen in my Uncategorised category.

But this post is nothing to do with those fun and amazing things. No. Today I want to talk about something that warrants 3-4 syllable words being used in the post title.

The ins and outs of operating systems aren’t exactly my speciality but I do know that they have a uniform file structure. In Windows, a lot of system stuff goes in C:\WINDOWS, your user files go in Documents and Settings, and programs go into Program Files. Inside your Windows user folder you automatically get a set of folders such as My Documents, My Pictures, etc. In Linux (and all UNIX I believe) your user’s folder is /home/username/ and other than having hidden dotfiles to store local application settings it’s basically empty. At most your desktop environment or distro will add a "Desktop" folder.

The question is how I should archive and organise my files such that they’re easy to find, easy to transport, and easy to manage. Sure, the Microsoft approach works for some people, but most of the people I know completely ignore that structure. I’ve been thinking about it for a while and have isolated a few ways people normally try to manage it.

By file extension. It’s easy to find and easy to archive. However it’s difficult to manage once you start getting rare file extensions or files with no extensions. It also becomes a pain when your program output or save files require several files of different types to be grouped together. Files can become strewn all around the computer and projects which require hardlinking to files will easily get out of hand.

By file type. This one is only slightly different to file extension in that people group things by what files are, not their extensions. For example "Images" instead of "jpg", "gif", etc. Pros and cons are similar to above.

By file purpose. This is a project based approach. Files are grouped by their uses, such as "Homework", or "Movies", or "Project X". This is rather commonly used but clashes often occur when it is used in parallel with others such as file extension or file type, especially when the nature of the project requires hardlinking to file locations.

By file attribute. This is often a temporarily used file structure when people are sorting out files. Such examples include "bobs photos", "to be sorted", "jazz music", etc. The main use of this structure is to make it easier to transport the files or manage a bulk collection of files. The sad story is that these directories persist waaaay past their useful lifetime and prove to be effective cloggers.

By time. Useful for archiving, completely regardless of the content. Used for management and for finding, but rarely useful for transporting files.

By referenced location. Some files are put in locations regardless of semantic value and instead for technical convenience – for example dotfiles, plaintext, logs, tmp, backup files, and referenced files from other apps (scripts, programs, graphic projects).

By organised chaos. This is probably what almost every single Windows user’s desktop looks like. A complete mess of random rubbish used to dump stuff. Files are strewn regardless of any attribute and found through searching, indexed searches, and file manager filters.
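To make the "by file type" approach from the list above a bit more concrete, here is a minimal sketch of a function that decides which type folder a file belongs in. The folder names and the extensions in each group are my own illustrative choices, not a standard, and anything it doesn’t recognise falls into a hypothetical "Unsorted" folder.

```shell
# type_folder NAME: print the type folder a file name would be sorted into.
# Folder names and extension groups are illustrative assumptions.
type_folder() {
    case $1 in
        # Take everything after the last dot and lower-case it.
        *.*) ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]') ;;
        # No dot at all means no extension.
        *)   ext= ;;
    esac
    case $ext in
        jpg|jpeg|gif|png) echo "Images" ;;
        mp3|ogg|flac)     echo "Music" ;;
        txt|odt|pdf)      echo "Documents" ;;
        *)                echo "Unsorted" ;;   # rare or missing extensions
    esac
}

type_folder holiday.JPG   # prints Images
type_folder Makefile      # prints Unsorted
```

The "Unsorted" case is exactly where this approach starts to hurt: rare extensions and extensionless files pile up there.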

Now of course searching for the zen of a user’s file structure will only end when a structure is able to accommodate a large number of files, a project-based workflow, technical constraints, archives, and miscellaneous files. In this case it’s useful to define what "accommodate" means, which is the ability to easily find a desired file, to sort and prune undesired or irrelevant files, to prevent duplication, to transport similar files easily (eg: all within one directory), to quickly break down a collection into manageable chunks, and to allow any newcomer to intuitively adapt to your filesystem.
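As a rough illustration of the "prevent duplication" point, a structure is easier to keep clean if you can at least spot duplicates. The sketch below is one possible way to do that, assuming the standard cksum and awk tools; the function name is made up. It treats files with the same checksum and size as duplicates, which is a heuristic and not a guarantee.

```shell
# list_duplicates DIR: print the cksum line of every file under DIR whose
# checksum and size match a file already seen (likely duplicates).
list_duplicates() {
    find "$1" -type f -exec cksum {} + |
        awk '{ key = $1 " " $2; if (seen[key]++) print }'
}

# Example (not run here): list_duplicates /home/username
```

Only the second and later copies are printed, so the output is a list of candidates for pruning.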

I currently run a structure where my homefolder is where I dump active files, I have a primarily "file type" structure, archives are done through "file purpose" (I despise time), and projects have their own substructures which are completely dependent on referenced locations. For me the biggest inconvenience is referenced locations, where I find myself unable to bulk manage files simply because of the inconvenience of having to re-reference their location. The rest is chaos. All in all, miles away from my personal zen.

Anybody who’s achieved a personal zen is welcome to share.


Tech tip #4: Copy a random set of files from a directory.

More for archival purposes than anything, today I wanted to copy some songs out of my serious mess of a music "collection" onto my microSD card. I didn’t want to have to choose and I haven’t rated my songs so that wouldn’t help. Instead I wanted a random selection of songs. I’m not a bashmaster (absolutely pathetic at it, actually) but this is what I ended up using – after symlinking all of the various directories I had my files under together:

find -L /home/drive/music -type f -name "*.mp3" | sort -R | tail -n100 | while IFS= read -r file; do cp -- "$file" /media/disk/music/; done

-n100 represents how many files are going to be copied. Hope it helps somebody! Of course any improvements are welcome.
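For anyone who wants to reuse this, here is the same idea wrapped in a function. It’s only a sketch: the function name and its arguments are my own invention, it swaps in GNU shuf to pick the random lines, and it assumes file names contain no newlines. Files that share a name in different source directories will overwrite each other at the destination.

```shell
# copy_random SRC DEST COUNT
# Copy COUNT randomly chosen .mp3 files from SRC (following symlinks) into DEST.
# Assumes GNU shuf is installed and that file names contain no newlines.
copy_random() {
    src=$1 dest=$2 count=$3
    mkdir -p "$dest" || return 1
    find -L "$src" -type f -name '*.mp3' |
        shuf -n "$count" |
        while IFS= read -r file; do
            cp -- "$file" "$dest"/
        done
}

# Example (not run here): copy_random /home/drive/music /media/disk/music 100
```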


A little introduction to MP3s

Hello there readers. Today I present to you yet another guest post by NathanKP from Inkweaver Review – please take some time to check out his website.

What is an MP3 and how does it work?

An MP3 is a file format designed for storing audio, most often music. The name is short for MPEG-1 Audio Layer III (later extended as MPEG-2 Audio Layer III), the compression scheme the format is built on. This scheme is what encodes music compactly enough to fit in a small file. Real music is a smooth analog wave form that comes directly from an instrument. When music is stored on a CD, however, it must be in a digital format: ones and zeros. A digital format cannot store a continuous wave form directly, so the wave is approximated by measuring it at a fixed sample rate. A CD samples the analog music 44,100 times a second and uses those samples to create a wave that is not perfectly smooth, but rather like stair steps. However, most listeners can’t hear the difference without listening very carefully with a trained ear. Sampling is a kind of data reduction, because analog music, on an LP for example, is a continuous signal and not a fixed list of numbers. CD sampling reduces this “infinite” detail to about 10 MB a minute. However, that is still much too large for ordinary purposes.
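The 10 MB a minute figure can be checked with a little arithmetic. The post only gives the sample rate, so the sketch below adds the other standard CD audio parameters as stated assumptions: 16 bits (2 bytes) per sample and 2 stereo channels.

```shell
# Uncompressed CD audio, using the standard CD parameters.
sample_rate=44100      # samples per second, per channel
bytes_per_sample=2     # 16-bit samples
channels=2             # stereo

cd_bytes_per_minute=$(( sample_rate * bytes_per_sample * channels * 60 ))
echo "CD audio: $cd_bytes_per_minute bytes per minute"
# prints "CD audio: 10584000 bytes per minute"
```

That is a little over 10 million bytes for every minute of music.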

MP3 is the next level of compression, able to store music data at approximately 1 MB per minute at a typical bitrate of 128 kbit/s. The way it does this is by simplifying the music, purposely losing some of the sounds. For one thing, most humans can only hear frequencies from about 20 Hz to 20 kHz. Some animals can hear sounds higher or lower than this but humans in general can’t. MP3 encoders take advantage of this by dropping the very highest frequencies, often cutting off below 20 kHz at lower bitrates, and by dropping quiet sounds that are masked by louder ones nearby.

Secondly, MP3 reduces the precision with which the sound is stored. Instead of keeping every sample exactly, the encoder keeps a coarser, rounded version of the signal, which simplifies the wave forms by removing small variations. To do this the music is first converted, using mathematical formulas, into a description of the frequencies that make up its wave forms.
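To give a feel for what a coarser approximation means, the sketch below rounds a few made-up sample values down to a grid of larger steps. This is only an illustration of lossy rounding in general, not the actual MP3 algorithm, and the function name, sample values, and step size are all invented for the example.

```shell
# quantize VALUE STEP: round VALUE down to the nearest multiple of STEP.
# Works on non-negative integers only, to keep the shell arithmetic simple.
quantize() {
    echo $(( $1 / $2 * $2 ))
}

for sample in 1003 1010 1021 1030; do
    printf '%s -> %s\n' "$sample" "$(quantize "$sample" 16)"
done
# prints:
# 1003 -> 992
# 1010 -> 1008
# 1021 -> 1008
# 1030 -> 1024
```

Notice that 1010 and 1021 end up as the same value. The small variation between them is gone, and in exchange there are fewer distinct values to store.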

Every kind of wave form can be approximated by a sum of simple waves of different frequencies. MP3 uses a mathematical transform (the modified discrete cosine transform) to work out how much of each frequency is present in a short slice of music. Once the detail that listeners are unlikely to hear has been discarded, these frequency values require much less storage space than the raw samples. The software that does the encoding and decoding is called a codec, short for coder-decoder. The decoder uses the stored frequency information to recreate the wave forms. It is sort of like graphing a complex calculus math problem. The problem might have only a few factors in it but the shape it creates can be quite complex. In this way MP3 is able to store the complex wave forms of music very efficiently.

Of course this is a lossy technique. Not only is the frequency range limited and fine detail discarded, but the decoded sound is itself merely a mathematical approximation of the original. However, most people can’t really hear the difference between a well-encoded MP3 and CD music, or even the pure analog music of an LP.
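To put the roughly 1 MB a minute figure in perspective, the sketch below works out the size of a minute of MP3 audio from its bitrate and compares it with uncompressed CD audio. The 128 kbit/s bitrate and the CD parameters (16-bit samples, 2 channels) are assumptions on my part, since the post does not state them, and the function name is made up.

```shell
# bytes_per_minute KBPS: size of one minute of audio at KBPS kilobits per second.
bytes_per_minute() {
    echo $(( $1 * 1000 / 8 * 60 ))
}

mp3_bytes=$(bytes_per_minute 128)        # assumed MP3 bitrate: 128 kbit/s
cd_bytes=$(( 44100 * 2 * 2 * 60 ))       # CD: 44,100 samples/s, 2 bytes, 2 channels

echo "MP3: $mp3_bytes bytes per minute"
# prints "MP3: 960000 bytes per minute"
echo "CD is about $(( cd_bytes / mp3_bytes )) times larger"
# prints "CD is about 11 times larger"
```

Higher bitrates trade size for quality: bytes_per_minute 320 gives 2400000, which is still well under a quarter of the CD figure.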

MP3 Software

There is a plethora of different MP3 players on the market. As far as free software for computers goes, the very best are VLC media player, a very light player that is easy on computer resources, and Winamp, another free MP3 player that has been around for a long time.

Note from Dion Moult: I would also like to recommend “mplayer”.


What is FTP?

Dear readers, today I present to you another guest post giving an introduction to FTP by the wonderful NathanKP. Those interested in suggesting their own topics or writing a post to be published (you will be credited accordingly) can use the new “Spam Us” link up there on the navigation, which is there for that very use :) Enjoy.

Right, a short introduction.

FTP (File Transfer Protocol) is a protocol, or communication technique, that runs over the internet. Unlike HTTP, which was designed primarily for delivering web pages such as HTML and XHTML documents, FTP is designed to transmit just about any type of file between computers. Since FTP is a different protocol it has its own prefix. When browsing the internet using a browser it is common to access addresses with the prefix “http://”. However FTP uses a different prefix: “ftp://”.

FTP is a very flexible protocol in that it makes file distribution easier when you are dealing with different operating systems, different file storage systems, or different character encodings. Compared with the difficulty of setting up a file sharing network between a Unix and a Windows computer, setting up FTP is much easier because both computers can “talk” the common FTP language.

What is an FTP site?

An FTP site is like a file cabinet where files are stored. Like a web server which stores the HTML documents that internet users can access, an FTP server stores files that can be distributed to users. When a user browses to a URL that begins with “ftp://” the FTP server responds and sends a list of the available files to the person’s browsing program. This list forms the FTP site itself.

An FTP site can also include security measures to prevent malicious users from performing denial of service attacks or to limit the people who can download the data from the FTP site. For example a company might have an FTP server so that its programmers can all access global project files. However, it would not be good if just anyone could get on the FTP site and steal the company’s source code files.

Therefore FTP servers often check the addresses of their users against an internal list of known and trusted hosts. They also require a login process. For public FTP sites that anyone can use there is often a login where the username is “anonymous” and the password is, by convention, your email address, which the FTP server may log for future reference.

More secure FTP servers will require a registration process which gives you a real username and password that allows you access to the FTP site.
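As a small illustration of the “ftp://” prefix and the anonymous login convention described above, the sketch below builds an FTP address with the login details embedded in it. The function name is made up, and ftp.example.com and me@example.com are placeholders, not a real site. The “@” in the email address has to be written as “%40” so it isn’t confused with the “@” that separates the login from the host name; this sketch only handles that one character.

```shell
# anon_ftp_url HOST PATH EMAIL: print an ftp:// URL that logs in as
# "anonymous" with EMAIL as the password.
anon_ftp_url() {
    host=$1 path=$2 email=$3
    # Percent-encode the "@" in the email so it does not clash with the
    # "@" separating credentials from the host. Other special characters
    # are not handled by this sketch.
    encoded=$(printf '%s' "$email" | sed 's/@/%40/g')
    printf 'ftp://anonymous:%s@%s/%s\n' "$encoded" "$host" "$path"
}

anon_ftp_url ftp.example.com pub/ me@example.com
# prints ftp://anonymous:me%40example.com@ftp.example.com/pub/
```

A tool that understands FTP addresses, such as curl, could then be given the result, for example curl "$(anon_ftp_url ftp.example.com pub/ me@example.com)" to list a directory.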

What is an FTP client?

The FTP client is the program which you use to view and download files on an FTP server. Just as a browser is required to view webpages, an FTP client is needed to see the file list on an FTP server and download the files. The transfer language and protocol used wouldn’t make sense to most users, just as raw HTML wouldn’t be very useful to someone who wanted to view a webpage. That is why the FTP client is needed to interpret it.

FTP clients come in many flavors. Some are graphical, operating much like the Explorer program on your computer. They show the list of files on the FTP server and give you a convenient way to transfer them to your local computer, usually by drag and drop. A command line FTP client may require you to enter the exact filename of the file you want to download.

However there are many different free FTP clients on the internet, so it should be easy to find one that is easy for you to use.
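To show what typing the exact filename looks like with a command line client, here is a sketch that prints the commands for a short session with the classic ftp program: connect, log in anonymously, switch to binary mode, fetch one file, and quit. The function name is made up, and the host, email address, and file name in the example are placeholders.

```shell
# ftp_script HOST EMAIL FILE: print the commands for a short session with
# the classic command line "ftp" client.
ftp_script() {
    host=$1 email=$2 file=$3
    printf 'open %s\n' "$host"              # connect to the server
    printf 'user anonymous %s\n' "$email"   # anonymous login, email as password
    printf 'binary\n'                       # transfer the file byte for byte
    printf 'get %s\n' "$file"               # the exact filename is required
    printf 'bye\n'                          # close the connection
}

ftp_script ftp.example.com me@example.com README
```

Piping the output into ftp -n (the -n option stops the client from trying to log in automatically) would run the session against a real server.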

This is a guest post from none other than NathanKP from Inkweaver Review.