kgutwin's Blog

CiteULike filing system

I've been trying for a while to come up with a systematic way of filing physical papers in my filing cabinet. My current filing system is awful and not useful. Here are some quick thoughts on how to do this right.

A good system would be:

  • efficient, meaning that there is a comfortable paper/folder ratio (not close to either 1 or infinity)
  • easily expandable, meaning the effort of adding a new paper is O(1) rather than, for example, O(n^2)
  • stable, meaning papers should rest in their folders for a relatively long time, and that papers would be re-arranged only infrequently
  • topical, meaning papers on closely related subjects sit physically closer together than papers on unrelated subjects

I like grouping by author, but that gets tricky when there are too few papers by one author. It's stable but not efficient, and only mildly topical (inasmuch as authors tend to write about the same thing).

Automatically discovering key words is HARD. Especially difficult is the long tail problem, where parameters seem to work well on a subset but there remain a lot of poorly categorized papers, reducing the efficiency. Also, automatic procedures tend not to be very stable.

My recent idea was to use CiteULike tags. Those are highly topical and flexible enough that efficiency could be achieved. The question is how best to use the tags, since some are highly general and others are highly specific.

My idea is to assign each paper to a 'tag' folder using whichever of its tags has the fewest articles while still being ABOVE A CUTOFF. This pushes the chosen tag to be as specific as possible without being so specific that the folder holds almost nothing else. Also, authors' names should be included in the list of possible 'tags'.
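Here is a rough sketch of how that rule might look in Python. None of it is tested against real CiteULike data: the `library` dictionary, the paper names, the tags, and the helper names `labels`, `choose_folder`, and `file_papers` are all made up for illustration. It also assumes that "above a cutoff" means strictly more than `cutoff` articles, that ties go to the alphabetically first tag, and that a paper with no qualifying tag is simply left unfiled.

```python
from collections import Counter

# Hypothetical sample library: paper -> its tags and authors (invented data).
library = {
    "paper_a": {"tags": ["genomics", "alignment"], "authors": ["Smith"]},
    "paper_b": {"tags": ["genomics", "alignment"], "authors": ["Smith"]},
    "paper_c": {"tags": ["genomics", "assembly"], "authors": ["Jones"]},
    "paper_d": {"tags": ["genomics"], "authors": ["Smith"]},
}


def labels(entry):
    """Treat author names as extra 'tags' alongside the real tags."""
    return set(entry["tags"]) | set(entry["authors"])


def choose_folder(entry, counts, cutoff):
    """Pick the least-used label that still has more than `cutoff` articles."""
    candidates = [t for t in labels(entry) if counts[t] > cutoff]
    if not candidates:
        return None  # assumption: nothing qualifies, so leave the paper unfiled
    # Fewest articles wins; ties are broken alphabetically (an assumption).
    return min(candidates, key=lambda t: (counts[t], t))


def file_papers(library, cutoff):
    """Map every paper to its chosen folder name."""
    # Count how many papers carry each label (tags and authors together).
    counts = Counter(t for entry in library.values() for t in labels(entry))
    return {name: choose_folder(entry, counts, cutoff)
            for name, entry in library.items()}


folders = file_papers(library, cutoff=1)
print(folders)
```

With this sample data and a cutoff of 1, the two alignment papers share an 'alignment' folder, paper_c falls back to 'genomics' because 'assembly' and 'Jones' each appear only once, and paper_d lands in a 'Smith' folder. Raising the cutoff trades specificity for fuller folders.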

I haven't had time to implement this yet but I might get around to it soon. It sounds like it could work ...

Posted on 2008-12-16 18:23:23