Archive for computers

Distributed Democracy

I’ve been looking into buying some bitcoins and have been navigating the vagaries of identity confirmation. The market I’ve heard the most about is Mt.Gox. Getting money to the exchange is done through Dwolla, which requires a copy of a photo ID. Mt.Gox also has a verification process, which requires a photo ID and a proof of residence.

The proof of residence is a utility bill or a voter registration. Say, for instance, you’re a student who rents: you may have neither, and so might not be able to trade. I’ve been thinking of an alternative. A person creates an account online and requests that a code be sent to them by mail. Once that code is entered back into the system, it verifies the person’s address.
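Here is a minimal sketch of that mailed-code flow. The in-memory `pending` store and the function names `request_code` and `confirm_code` are made up for illustration; a real system would need persistence, expiry, and an actual way to print and post the letter.

```python
import hmac
import secrets

# Hypothetical in-memory store: account -> address, mailed code, verified flag.
pending = {}

def request_code(account, address):
    """Generate a one-time code to be printed and mailed to the address."""
    code = secrets.token_hex(4)  # 8 hex characters, short enough to type in
    pending[account] = {"address": address, "code": code, "verified": False}
    return code  # in practice this goes on the letter, never back to the browser

def confirm_code(account, submitted):
    """Return True and mark the address verified if the code matches."""
    record = pending.get(account)
    if record is None:
        return False
    cleaned = submitted.strip().lower()
    # compare_digest avoids leaking information through timing differences
    ok = hmac.compare_digest(record["code"].encode(), cleaned.encode())
    if ok:
        record["verified"] = True
    return ok
```

The only thing the code proves is that whoever holds the account can read mail delivered to that address, which is the property the utility bill was standing in for.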

This same process could be combined with the voter registration process to provide an electronic component to civic interactions. Rather than everyone voting on every bill, people would still specify proxies for most of their decisions. The process could be more dynamic, however, and with more granularity in expertise. Rather than a single politician making decisions about a broad spectrum, votes on issues could be divided up and proxied to different people.
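As a rough sketch of per-topic proxies, assume each voter names a proxy for some topics plus a fallback, and may still vote directly on a given bill. All the names and data below are hypothetical.

```python
from collections import Counter

# Hypothetical data: each voter names a proxy per topic, with a fallback.
proxies = {
    "ann": {"energy": "dr_lee", "default": "rep_kim"},
    "bob": {"default": "rep_kim"},
    "cat": {"energy": "dr_lee", "health": "dr_roy", "default": "rep_kim"},
}

# How each proxy votes on the bill in question.
proxy_votes = {"dr_lee": "yes", "dr_roy": "no", "rep_kim": "no"}

def tally(topic, direct_votes=None):
    """Count one vote per voter: a direct vote wins, then the topic proxy, then the default."""
    direct_votes = direct_votes or {}
    counts = Counter()
    for voter, choices in proxies.items():
        if voter in direct_votes:
            counts[direct_votes[voter]] += 1
            continue
        proxy = choices.get(topic, choices.get("default"))
        if proxy in proxy_votes:
            counts[proxy_votes[proxy]] += 1
    return counts
```

Calling `tally("energy")` counts ann and cat through their energy proxy and bob through his default, so the same three voters can land on different representatives from one bill to the next.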

Another application could be in conjunction with a housing site. It could be used to verify that a particular account controls a particular address.


Iterative Science Experiments

In my computational modeling class on Friday, Dr. Palmeri discussed a type of iterative experimentation put forth by Jay Myung & Mark Pitt at Ohio State.

The basic work of the course is models of the mind that make predictions of human behavior. Specifically, at this point we are looking at cases where there are competing models and researchers wish to determine which is best able to produce realistic data.

The example was discriminating between memory retention models: at which time points should retention be tested so as to optimally distinguish between them? As data is collected, best-fitting values for each model’s parameters can be calculated.

Given those parameters then, there are memory retention test points where the competing models will be expected to give the most divergent predictions. The data collection algorithm then can adjust the experimental setup to collect at those points.

The data from those points is then added to the dataset used to determine the best-fitting parameters, and the process iterates again.

Essentially, the experimenter is an AI that is mathematically tuning the experiment in real time to maximize the probability of a conclusive finding. Science is awesome, isn’t it?
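The loop described above can be sketched roughly as follows. This is a simplification and not Myung and Pitt’s actual method: it fits each model by least squares over a coarse grid and picks the delay where the two fitted curves differ most. The two model forms, the parameter grid, the candidate delays, and the data are all made up for illustration.

```python
import math

def power_model(t, a, b):
    """Retention predicted by a power-law forgetting curve."""
    return a * (t + 1) ** (-b)

def exp_model(t, a, b):
    """Retention predicted by an exponential forgetting curve."""
    return a * math.exp(-b * t)

def fit(model, data, grid):
    """Grid-search least squares: return the (a, b) pair with the smallest squared error."""
    def sse(params):
        return sum((model(t, *params) - y) ** 2 for t, y in data)
    return min(grid, key=sse)

def next_test_point(data, candidates, grid):
    """Pick the delay where the two fitted models disagree the most."""
    p_pow = fit(power_model, data, grid)
    p_exp = fit(exp_model, data, grid)
    return max(
        candidates,
        key=lambda t: abs(power_model(t, *p_pow) - exp_model(t, *p_exp)),
    )

# Coarse parameter grid and candidate delays (both illustrative).
grid = [(a / 10, b / 10) for a in range(5, 11) for b in range(1, 11)]
candidates = [1, 2, 5, 10, 20, 40]

# Made-up observations: (delay, proportion recalled).
data = [(1, 0.70), (5, 0.45)]
t_next = next_test_point(data, candidates, grid)
# The experiment would now test at t_next, append the result to data, and repeat.
```

Each pass refits both models on everything collected so far, so the chosen test point moves as the parameter estimates settle.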


Most Expensive Thing Ever

One of my classes this semester is on multimedia systems. It’s a seminar class where the student presentations make up the second half of the semester.

I’ve decided to do my bit on multitouch displays (touch screens with multiple styluses à la the iPhone), and for the project I want to actually build one.

To this end I need a projector. I hunted around, picked one out, and my credit card rejected the purchase as fraud. I called, got it cleared, and attempted to buy it again. It was again flagged as fraud.

I was irritated until I realized that at $750 this is easily more than twice as expensive as anything I’ve purchased on this card in the two years I’ve had it. I suppose I can’t blame the algorithms for being suspicious. ☺


Blog Rises Again

After two weeks of massive failure (which apparently constitutes a medium-severity problem for Dreamhost) they have restored my site from an offsite backup.

I’m certainly going to keep this in mind for the future when I have sites that actually need to be up. Two weeks is absurd for just about any problem short of your facility burning down (which theirs didn’t).

A few interesting things happened in the last couple of weeks, mostly on my weekends. One was a huge hippie love-fest in the woods (which made me realize how much I enjoy the hippies). There were also massive amounts of sociology, which I have some more analytical thoughts on that I’ll post about later.


Little Dutchboys

I wanted to post a little synthetic idea connecting something I observed watching Wanted this weekend with Tai Chi boxing and Malcolm Gladwell’s Blink.

Since Jenni’s copy of Blink is in Baltimore, I figured I’d just go on Fictionwise and buy a copy so I could cut and paste from the ebook. They’ve got it, but I can’t read it. The DRM they use to keep people from sharing it isn’t supported on my Ubuntu desktop.

I probably wouldn’t buy the DRMed version anyway. So long as someone else controls my access to my media, I risk ending up like folks who bought from MSN Music. Once the service went belly-up they turned off the computers allowing people to access their music. Not a problem unless you do something crazy like get a new computer.

After I gave up on Fictionwise, I thought perhaps Amazon might be able to help me out, but you can only buy Kindle books if you have purchased a Kindle from Amazon.

I eventually got tired of screwing around with it, looked at a page on Amazon’s preview, entered a sufficiently long sentence to be unique into Google, and just pirated the book. I really did try to buy the book but, ironically, they were so busy trying to keep me from stealing it that they wouldn’t take my money. (I guess I could go ahead and pay for it now, but I really don’t want to give the businesses the impression that what they’re doing satisfies me as a customer.)

Read the rest of this entry »


Item:Item Artist Similarity

Item-to-item similarity is a method popularized by Amazon for computing the similarity of items in its catalog. The reasoning is that item similarities are more static than user similarities, so in situations where finding the similarity requires extensive computation, they are more robust to infrequent updates.
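A minimal sketch of the idea, assuming cosine similarity over per-user play counts with artists as the items. The listening data and function names are hypothetical, and this is not Amazon’s implementation.

```python
import math
from collections import defaultdict

# Hypothetical listening data: user -> {artist: play count}
plays = {
    "u1": {"Radiohead": 5, "Portishead": 3},
    "u2": {"Radiohead": 2, "Portishead": 4, "Bjork": 1},
    "u3": {"Bjork": 6, "Portishead": 1},
}

def item_vectors(user_items):
    """Invert user -> item counts into item -> {user: count} vectors."""
    vectors = defaultdict(dict)
    for user, items in user_items.items():
        for item, count in items.items():
            vectors[item][user] = count
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity_table(user_items):
    """Precompute every item pair once so later lookups are cheap."""
    vectors = item_vectors(user_items)
    items = sorted(vectors)
    return {
        (a, b): cosine(vectors[a], vectors[b])
        for a in items
        for b in items
        if a < b
    }
```

The expensive part is `similarity_table`, which is what would run as an occasional batch job; serving a recommendation then only needs a dictionary lookup.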

Read the rest of this entry »


TF-IDF Applied to Recommendations

One of the possibilities that Aura is exploring is to leverage an existing TF-IDF implementation in minion.
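For reference, here is a bare-bones TF-IDF sketch. It assumes each artist is treated as a "document" whose "terms" are listener tags, which is my illustration rather than anything confirmed about Aura, and it does not use or imitate minion’s API. TF-IDF has several variants; this one uses length-normalized term frequency and a plain logarithmic inverse document frequency.

```python
import math
from collections import Counter

# Hypothetical "documents": artist -> list of tags applied by listeners
tagged = {
    "Radiohead": ["rock", "alternative", "rock", "electronic"],
    "Portishead": ["trip-hop", "electronic", "electronic"],
    "Bjork": ["electronic", "experimental"],
}

def tf_idf(docs):
    """Return doc -> {term: weight}, where weight = tf * log(N / df)."""
    n = len(docs)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for terms in docs.values() for term in set(terms))
    weights = {}
    for name, terms in docs.items():
        counts = Counter(terms)
        weights[name] = {
            term: (count / len(terms)) * math.log(n / df[term])
            for term, count in counts.items()
        }
    return weights
```

A tag that every artist carries, like "electronic" here, gets a weight of zero, so the terms that are left are the ones that actually set an artist apart.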

Read the rest of this entry »


Array Indexing Methods In Java, R and Python

I’ve been working a bit more with R doing some data analysis. I keep getting hung up on the array access semantics and wanted to write them out so I could remember them.

Arrays are a very common data structure in computer science and most every language has support for them. Most intro classes describe an array as a set of boxes where you can stick stuff and refer to them by number.
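As a quick illustration of the boxes-by-number idea, here is a Python sketch with comments noting where Java and R differ. The variable names are arbitrary, and the R and Java notes cover only plain vectors and arrays.

```python
boxes = ["a", "b", "c", "d"]

# Python and Java number the boxes from 0; R numbers them from 1,
# so the same element in R would be boxes[1].
first = boxes[0]

# In Python a negative index counts back from the end.
# In R, boxes[-1] instead means "everything except the first element".
last = boxes[-1]

# Python slices leave out the end index, so this is boxes 1 and 2.
# R's boxes[2:3] includes both ends and selects the same two elements.
middle = boxes[1:3]

# Reading past the end raises IndexError in Python.
# Java throws ArrayIndexOutOfBoundsException; an R vector returns NA.
out_of_range = False
try:
    boxes[4]
except IndexError:
    out_of_range = True
```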

Read the rest of this entry »


Leaving the Ocean Unboiled

I’ve been wandering a bit about intellectually. There was a method to this madness. Here was my plan for profit:

  1. Semiotics — Separate items and their meanings. Rather than considering a song a discrete thing that a user has a preference for, think of it as a complex symbol that has meaning for a user.
  2. Memetics — Examine shared cultural myths as philosophies of human nature and argue that the process guiding their specification is the same as the one driving philosophies about the world toward sciences.
  3. Preference as Conditioning — Distinguish between cultural symbols (guys wear pants) and simple symbols (the sun meaning warmth), argue that music communicates both types and that a unified perspective of messages can incorporate both.
  4. Hidden Markov Models — Posit that preference arises from the conditioning of a relatively small number of elements. Attempt to use patterns in the expressed preferences to guess the layout of this hidden network. Introduce the concept of ego as a state maintenance function on a stateless network.
  5. Vector Distance — Come up with some sort of unified way to train a Markov model on cepstral coefficients, tags and lyrics and use the weights of the nodes as a vector for computing user similarity (or, by examining the networks of the users liking a certain thing, compute a vector representing the messages communicated by a complex sign).

So, kinda out there as an idea. I didn’t really mean for it to come out quite that strangely. The task I was given was, “come up with a collaborative filter.” Recall that my definition of “collaborative filter” is pretty amorphous. I’ve read several papers on collaborative filtering, but none of them is particularly explicit. The definition I got from Wikipedia was:

Read the rest of this entry »


Webmaster Tools and WordPress 404

I have Google’s webmaster tools as a gadget on my Google start page.

You have to verify that you control a domain before Google will tell you stuff like the top search phrases and RSS subscribers and whatnot. When I logged in this morning I noticed that a couple of the domains that I had previously verified by placing a specially named HTML file on the site were now unverified.

“Stupid Google,” I thought to myself. “Why would you forget that I had verified those things?” I was thinking that perhaps it was because I had removed the specially named files and Google wanted to be able to reverify every so often.

Turns out Google was in fact brighter than whoever wrote my WordPress theme.

Every time your web browser gets a page from a webserver there’s a number, called a status code, associated with that transaction that categorizes the type of page that you are getting. If you try to access a page that doesn’t exist on my site you get a cute picture of a puppy. This is all fine and good except the status code that page sends is 200 (“OK”), which isn’t right. The reason Google unverified my site is that it realized my site reported every single page requested as being there rather than returning a 404 (“Not Found”) status for pages that don’t exist.

This means that if someone else tried to put my site into webmaster tools and verify it, then whatever special filename Google gave them would also come back as being present (status 200), and they could claim a site they don’t control.
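To make the difference concrete, here is a small sketch of a handler that serves the friendly error page with the right status, one that behaves the way my theme did, and a probe of the kind I assume Google runs. The page contents and function names are made up, and nothing here is WordPress or Google code.

```python
import secrets
from http import HTTPStatus

# Hypothetical site content.
PAGES = {"/": "<h1>Home</h1>", "/about": "<h1>About</h1>"}
PUPPY_PAGE = "<h1>Not found</h1><img src='/puppy.jpg' alt='puppy'>"

def respond(path):
    """Return (status, body); the friendly error page still carries a 404."""
    if path in PAGES:
        return HTTPStatus.OK, PAGES[path]
    return HTTPStatus.NOT_FOUND, PUPPY_PAGE

def broken_respond(path):
    """What the theme did: every path, real or not, comes back as 200."""
    return HTTPStatus.OK, PAGES.get(path, PUPPY_PAGE)

def has_soft_404(handler):
    """Ask for a random path that cannot exist; a 200 means the site never says 'not found'."""
    probe = "/" + secrets.token_hex(16) + ".html"
    status, _ = handler(probe)
    return status == HTTPStatus.OK
```

Here `has_soft_404(broken_respond)` is true and `has_soft_404(respond)` is false, and the visitor sees the same puppy either way; only the status code changes.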

Smart as it is, Google made me fix this. So, I’m sorry, Google, for my lack of faith.

