A scalable and adaptable standardised user file structure?

I’ve got two more mock exams to cover (both on Monday), some University applications progress has been made, more and more of the ThoughtScore project is inching its way into my WIPUP profile (starting to get to the interesting bit), and I’ve opened Blender again to do a quick animation favour which can be seen in my Uncategorised category.

But this post is nothing to do with those fun and amazing things. No. Today I want to talk about something that warrants 3-4 syllable words being used in the post title.

The ins and outs of operating systems aren’t exactly my speciality but I do know that they have a uniform file structure. In Windows, a lot of system stuff goes in C:\WINDOWS, your user files go in Documents and Settings, and programs go into Program Files. Inside your Windows user folder you automatically get a set of folders such as My Documents, My Pictures, etc. In Linux (and all UNIX I believe) your user’s folder is in /home/username/ and other than having hidden dotfiles to store local application settings it’s basically empty. At most your desktop environment or distro will add a "Desktop" folder.

The question is how I should archive and organise my files so that they’re easy to find, easy to transport, and easy to manage. Sure, the Microsoft approach works for some people, but most of the people I know completely ignore that structure. I’ve been thinking about it for a while and have isolated a few ways people normally try to manage it.

By file extension. It’s easy to find and easy to archive. However it’s difficult to manage once you start getting rare file extensions or files with no extensions. It also becomes a pain when your program output or save files require several files of different types to be grouped together. Files can become strewn all around the computer, and projects which rely on hard-coded paths to files will easily get out of hand.
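To get a feel for how a by-extension layout would break down for a given folder, a quick tally helps. This is only a rough sketch: the function name count_by_extension is made up for illustration, and it assumes filenames without newlines in them. Dotfiles and files with no extension are lumped together as "(none)".

```shell
#!/usr/bin/env bash
# Tally the files under a directory by extension, most common first.
# count_by_extension is a hypothetical helper name, not a standard tool.
count_by_extension() {
    local dir=$1
    find "$dir" -type f | while IFS= read -r path; do
        name=${path##*/}                 # strip the directory part
        case $name in
            ?*.?*) printf '%s\n' "${name##*.}" ;;  # text after the last dot
            *)     printf '(none)\n' ;;            # dotfiles, no extension
        esac
    done | sort | uniq -c | sort -rn
}

# Example (path is a placeholder):
#   count_by_extension /home/username
```

A long tail of extensions with only one or two files each is a good sign that this scheme will be painful to maintain.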

By file type. This one is only slightly different to file extension in that people group things by what files are, not their extensions. For example "Images" instead of "jpg", "gif", etc. Pros and cons are similar to above.

By file purpose. This is a project-based approach. Files are grouped by their uses, such as "Homework", or "Movies", or "Project X". This is rather commonly used, but clashes often occur when it is used in parallel with others such as file extension or file type, especially when the nature of the project requires hard-coded paths to file locations.

By file attribute. This is often a temporarily used file structure when people are sorting out files. Such examples include "bobs photos", "to be sorted", "jazz music", etc. The main use of this structure is to make it easier to transport the files or manage a bulk collection of files. The sad story is that these directories persist waaaay past their useful lifetime and prove to be effective cloggers.

By time. Useful for archiving, completely regardless of the content. Used for management and for finding, but rarely useful for transporting files.
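For what it’s worth, a by-time archive is about the easiest structure to automate. The sketch below moves every regular file sitting directly inside one folder into YYYY-MM subfolders based on modification time. The name archive_by_month is invented for this example, and it assumes GNU date (where -r takes a file); other systems may need a different way to read the timestamp.

```shell
#!/usr/bin/env bash
# archive_by_month SRC DEST
# Move each regular file directly inside SRC into DEST/YYYY-MM/,
# using the file's modification time. Hypothetical helper; assumes GNU date.
archive_by_month() {
    local src=$1 dest=$2 file month
    for file in "$src"/*; do
        [ -f "$file" ] || continue            # skip directories and empty globs
        month=$(date -r "$file" +%Y-%m)       # e.g. 2020-03
        mkdir -p -- "$dest/$month"
        mv -n -- "$file" "$dest/$month/"      # -n: never overwrite an existing file
    done
}

# Example (paths are placeholders):
#   archive_by_month /home/username/to-be-sorted /home/username/archive
```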

By referenced location. Some files are put in locations regardless of semantic value and instead for technical convenience – for example dotfiles, plaintext, logs, tmp, backup files, and referenced files from other apps (scripts, programs, graphic projects).

By organised chaos. This is probably what almost every single Windows user’s desktop looks like. A complete mess of random rubbish used to dump stuff. Files are strewn about regardless of any attribute and found through searching, indexed searches, and file manager filters.

Now of course searching for the zen of a user’s file structure will only end when a structure is able to accommodate a large number of files, a project-based workflow, technical constraints, archives, and miscellaneous files. In this case it’s useful to define what "accommodate" means, which is an ability to easily find a desired file, to sort and prune undesired or irrelevant files, to prevent duplication, to transport similar files easily (eg: all within one directory), to quickly break down a collection into manageable chunks, and to allow any newcomer to intuitively adapt to your filesystem.

I currently run a structure where my homefolder is where I dump active files, I have a primarily "file type" structure, archives are done through "file purpose" (I despise time), and projects have their own substructures which are completely dependent on referenced locations. For me the biggest inconvenience is referenced locations, where I find myself unable to bulk manage files simply because of the inconvenience of having to re-reference their location. The rest is chaos. All in all, miles away from my personal zen.

Anybody who’s achieved a personal zen is welcome to share.


Tech tip #4: Copy a random set of files from a directory.

More for archival purposes than anything, today I wanted to copy some songs out of my serious mess of a music "collection" onto my microSD card. I didn’t want to have to choose and I haven’t rated my songs so that wouldn’t help. Instead I wanted a random selection of songs. I’m not a bashmaster (absolutely pathetic at it, actually) but this is what I ended up using – after symlinking all of the various directories I had my files under together:

find -L /home/drive/music -type f -name "*.mp3" | sort -R | tail -n100 | while IFS= read -r file; do cp -- "$file" /media/disk/music/; done

-n100 represents how many files are going to be copied. Hope it helps somebody! Of course any improvements are welcome.
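If you want to reuse this, the same idea can be wrapped in a small function. Treat it as a sketch rather than the definitive way to do it: copy_random is a name I’ve made up, and it assumes GNU findutils and coreutils (shuf -z, xargs -0 -r and cp -t are GNU options). It passes filenames NUL-separated, so spaces and other odd characters in song names are safe.

```shell
#!/usr/bin/env bash
# copy_random SRC DEST COUNT [PATTERN]
# Copy COUNT randomly chosen files matching PATTERN (default *.mp3)
# from SRC, following symlinks, into DEST. Hypothetical helper; GNU tools assumed.
copy_random() {
    local src=$1 dest=$2 count=$3 pattern=${4:-*.mp3}
    mkdir -p -- "$dest"
    find -L "$src" -type f -name "$pattern" -print0 |   # NUL-separated paths
        shuf -z -n "$count" |                           # pick COUNT at random
        xargs -0 -r cp -t "$dest" --                    # copy them in one go
}

# Example, matching the one-liner above:
#   copy_random /home/drive/music /media/disk/music 100
```

One thing to watch: if two songs in different folders share a filename, they will collide in the destination, so you may end up with fewer files than you asked for.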