I’ve been wandering about a bit intellectually, but there was a method to this madness. Here was my plan for profit:
- Semiotics — Separate items and their meanings. Rather than considering a song a discrete thing that a user has a preference for, think of it as a complex symbol that has meaning for a user.
- Memetics — Examine shared cultural myths as philosophies of human nature and argue that the process guiding their specification is the same as the one driving philosophies about the world toward sciences.
- Preference as Conditioning — Distinguish between cultural symbols (guys wear pants) and simple symbols (the sun meaning warmth), argue that music communicates both types and that a unified perspective of messages can incorporate both.
- Hidden Markov Models — Posit that preference arises from the conditioning of a relatively small number of elements. Attempt to use patterns in the expressed preferences to guess the layout of this hidden network. Introduce the concept of ego as a state maintenance function on a stateless network.
- Vector Distance — Come up with some sort of unified way to train a Markov model on cepstral coefficients, tags and lyrics and use the weights of the nodes as a vector for computing user similarity (or, by examining the networks of the users liking a certain thing, compute a vector representing the messages communicated by a complex sign).
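To make that last step a little less hand-wavy, here is a minimal sketch of the vector-distance part only. It assumes the hard part is already done: each user has somehow been reduced to a fixed-length vector of node weights. The user names, the weights, the choice of cosine similarity, and the use of a plain average to stand in for a sign's "message" vector are all my own illustrative assumptions, not anything the plan above pins down.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length weight vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector carries no direction to compare
    return dot / (norm_a * norm_b)

# Hypothetical node weights, one vector per user (made-up numbers).
users = {
    "alice": [0.9, 0.1, 0.4],
    "bob": [0.8, 0.2, 0.5],
    "carol": [0.1, 0.9, 0.0],
}

def most_similar(name, users):
    """Return (other_user, similarity) for the closest other user."""
    target = users[name]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in users.items()
        if other != name
    ]
    return max(scored, key=lambda pair: pair[1])

def sign_vector(fans, users):
    """Assumed stand-in for a sign's 'message': the mean of its fans' vectors."""
    vectors = [users[f] for f in fans]
    return [sum(column) / len(vectors) for column in zip(*vectors)]
```

Calling `most_similar("alice", users)` picks out whichever other user points in the most similar direction, and `sign_vector(["alice", "bob"], users)` gives a vector for a song that both of them like, which can then be compared against any user with the same `cosine_similarity` function.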
So, kinda out there as an idea. I didn’t really mean for it to come out quite that strangely. The task I was given was, “come up with a collaborative filter.” Recall that my definition of “collaborative filter” is pretty amorphous. I’ve read several papers on collaborative filtering, but none of them defines the term very explicitly. The definition I got from Wikipedia was: