Linear Models
I haven’t been writing very much about the technical stuff lately for a few reasons. I’m wrapping up one project, and I am down to the fussiest of silly details, like how to make sure that the exported file is in the right format to be easily used by the person who requested it. For another project I need to reinstall a whole bunch of stuff on the server; that will either be a dull sequence of accepting the terms and conditions or else it will be a nightmare. And the newest project is something that I shouldn’t be saying much about.
I know that makes it seem very glamorous. But it’s not. I am not building some sort of sophisticated system that is going to disrupt anything. Rather, we have a lot of data that it’s not polite to talk about, and I’m trying to find features that are weird enough that a human should take a closer look. This data is a mix of text, numbers, dates, and more, and computers are not known for being good readers or having good judgement. (Yet.)
The one thing that I was hoping to implement today but did not finish was a check for whether there is a linear relationship between two of the collections of numbers. We’d been tacitly assuming that these collections were related linearly, in the linear algebra sense: one is a constant multiple of the other, so the line passes through the origin. But now I am wondering if they are related linearly in the linear regression sense (that is to say, affine), where the line is allowed an intercept. Maybe the false positives that I was getting from the old model were because the points were clustered near a line with a non-zero intercept. Maybe these points really do sit near a line; it just might be a line that does not go through the origin.
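To make the distinction concrete, here is a minimal sketch of the comparison, assuming plain least squares is a reasonable way to measure "sits near a line". It is not the real check and the data is made up (the real data stays private); the function names `fit_through_origin` and `fit_with_intercept` are just for illustration. It fits y = a·x and y = a·x + b to the same points and compares the leftover error.

```python
import random


def fit_through_origin(xs, ys):
    """Least-squares fit of y = a*x (linear in the linear algebra sense)."""
    slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    rss = sum((y - slope * x) ** 2 for x, y in zip(xs, ys))
    return slope, rss


def fit_with_intercept(xs, ys):
    """Least-squares fit of y = a*x + b (affine, as in linear regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, rss


# Made-up data: points near the line y = 2x + 5, which misses the origin.
rng = random.Random(0)
xs = [rng.uniform(1.0, 10.0) for _ in range(200)]
ys = [2.0 * x + 5.0 + rng.gauss(0.0, 0.5) for x in xs]

slope_lin, rss_lin = fit_through_origin(xs, ys)
slope_aff, intercept_aff, rss_aff = fit_with_intercept(xs, ys)

print(f"through origin: slope={slope_lin:.3f}, rss={rss_lin:.1f}")
print(f"with intercept: slope={slope_aff:.3f}, "
      f"intercept={intercept_aff:.3f}, rss={rss_aff:.1f}")
```

The affine fit can never have a larger residual than the through-origin fit, since it has one more parameter to play with. So the interesting question is whether the gap is large and whether the fitted intercept is clearly away from zero. If it is, points that look like outliers under the through-origin model may just be ordinary points on a shifted line.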