Leaving the Ocean Unboiled

I’ve been wandering about a bit intellectually, but there was a method to this madness. Here was my plan for profit:

  1. Semiotics: Separate items from their meanings. Rather than considering a song a discrete thing that a user has a preference for, think of it as a complex symbol that has meaning for a user.
  2. Memetics: Examine shared cultural myths as philosophies of human nature, and argue that the process guiding their specification is the same as the one driving philosophies about the world toward sciences.
  3. Preference as Conditioning: Distinguish between cultural symbols (guys wear pants) and simple symbols (the sun meaning warmth), and argue that music communicates both types and that a unified perspective of messages can incorporate both.
  4. Hidden Markov Models: Posit that preference arises from the conditioning of a relatively small number of elements. Attempt to use patterns in the expressed preferences to guess the layout of this hidden network. Introduce the concept of ego as a state maintenance function on a stateless network.
  5. Vector Distance: Come up with some sort of unified way to train a Markov model on cepstral coefficients, tags, and lyrics, and use the weights of the nodes as a vector for computing user similarity (or, by examining the networks of the users who like a certain thing, compute a vector representing the messages communicated by a complex sign).

So, kinda out there as an idea. I didn’t really mean for it to come out quite that strangely. The task I was given was, “come up with a collaborative filter.” Recall that my definition of “collaborative filter” is pretty amorphous. I’ve read several papers on collaborative filtering, but none of them defines the term particularly explicitly. The definition I got from Wikipedia was:

Webmaster Tools and WordPress 404

I have Google’s webmaster tools as a gadget on my Google start page.

You have to verify that you control a domain before Google will tell you stuff like the top search phrases and RSS subscribers and whatnot. When I logged in this morning I noticed that a couple of the domains that I had previously verified, by placing a specially named HTML file on the site, were now unverified.

“Stupid Google,” I thought to myself. “Why would you forget that I had verified those things?” I was thinking that perhaps it was because I had removed the specially named files and Google wanted to be able to reverify every so often.

Turns out Google was in fact brighter than whoever wrote my WordPress theme.

Every time your web browser gets a page from a webserver there’s a number, called a status code, associated with that transaction that categorizes the result of the request. If you try to access a page that doesn’t exist on my site you get a cute picture of a puppy. This is all fine and good, except the status code that page sends is 200 (“OK”), which isn’t right. The reason Google unverified my site is that it realized my site reported every single page requested as being there, rather than returning a 404 (“Not Found”) status for pages that don’t exist.

This means that if someone else tried to put my site into webmaster tools and verify it, then whatever special filename Google gave them would also come back as being present (status 200), and Google would have no way to tell that they didn’t actually control the site.
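Here is a minimal sketch of that kind of check in Python. It is a guess at the shape of the test, not Google’s actual method: ask for a file name that can’t possibly exist and see whether the server claims to have it. The two handler classes, the helpers `serve`, `status_of`, and `has_soft_404`, and the `google<random>.html` probe name are all made up for illustration; the servers are throwaway local ones standing in for the broken and the fixed theme.

```python
import threading
import urllib.error
import urllib.request
import uuid
from http.server import BaseHTTPRequestHandler, HTTPServer


class PuppyHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the broken theme: no real pages exist here,
    and every request gets the puppy page with status 200."""

    missing_status = 200  # the bug: a missing page is reported as "OK"

    def do_GET(self):
        body = b"<h1>Woof! Nothing to see here.</h1>"
        self.send_response(self.missing_status)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


class FixedHandler(PuppyHandler):
    """Same puppy page, but honestly labelled as 404."""

    missing_status = 404


def serve(handler):
    """Start a throwaway server on a free local port; return (server, base_url)."""
    server = HTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_address[1]}"


def status_of(url):
    """Return the HTTP status code for a GET of url."""
    # Bypass any configured proxy, since the demo only talks to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # urllib raises on 4xx/5xx; the code is still the answer


def has_soft_404(base_url):
    """True if the site claims a page with a random, made-up name exists."""
    probe = f"{base_url}/google{uuid.uuid4().hex}.html"
    return status_of(probe) == 200


if __name__ == "__main__":
    for handler in (PuppyHandler, FixedHandler):
        server, base_url = serve(handler)
        print(handler.__name__, "soft 404:", has_soft_404(base_url))
        server.shutdown()
        server.server_close()
```

A site that answers 200 to the random probe can’t be verified by file name, because the verification file “exists” for everybody. Once missing pages return 404, the puppy can stay and the check passes.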

Smart as it is, Google made me fix this. So, I’m sorry, Google, for my lack of faith.
