Based on how long it has been since I have updated this blog, I am now confident that the sticky note that says “write about random numbers” has been on my laptop for several weeks. This is supported by the fact that the ink is somewhat smudged and that the other sticky notes are arranged in such a way that the random numbers note looks like it was there before them.
Part of the reason for blog silence goes under the general umbrella of “interview stuff” and part of it is “don’t snark about the landscapers until the project is done.”
I got my lesson on the irrigation system yesterday, which signals that the project is pretty much done; I also wrote a large check, which supports that theory. We are not sure what gets watered by zone 10, nor have we determined which zone waters my lemon tree. My beloved zone 11 has been remapped to zone 9. Unlike the iPod that I had many years ago, the controller for the sprinkler system does not have a “shuffle” mode. I need to decide how much water each zone will get, how many times per day, and how many times per week. The lemon tree might be on its own.
There is a project that I worked on last year, for which I wrote thousands of lines of code. Right before it was supposed to be merged to master, the stakeholders decided that they wanted to make major changes, but that they were not sure exactly what needed to be done, so the project was put aside. (Added wrinkle: I also wrote a bunch of API endpoints – on an entirely different system – as helpers for this project, and they have been in production for nearly a year – but based on the older version of the spec.) Yesterday I discovered that they have come up with a plan of what they want and that someone who is not me is going to be working on this. (They have no idea what a good call they made by asking someone else to work on it.) I let her know where to find my previous work. I also let her know that the test server that can be used for those API endpoints always has a highly sanitized database that contains none of the information that is needed to actually test them. Since it’s not anywhere in my branch, I also pasted into the Slack thread the chunk of SQL that I use to fill the database with random data to be used for testing.
A house on my block went up for sale a little over a month ago for $1.3 million. It is currently listed as “pending.” This is crazy-money because I live in north county inland, roughly 35 miles from downtown San Diego. It is routinely 20 degrees hotter here than it is in coastal areas. People ask if it is safe to live in this city; we have uniformly terrible schools; someone at work refers to this city with the suffix “-ghetto” because it is known for having a lot of poor people. Yes, it’s a much nicer house than mine, but is it really worth that much? Hashtag California real estate. No matter what economics class tried to teach me about efficient markets, I find it hard to believe that there is any rhyme or reason about the price of houses in San Diego County. For years politicians have been telling us that California is going to be a ghost town because it is too expensive and everyone is moving to Texas. I’ve never had any interest in moving to Texas. I’ve come to discover that some people in Texas support my point of view.
This one is clearly about interview stuff: We are interviewing people to do things that are similar to what I do, so I was put in charge of setting up the interview tasks. So far we have not had much in the way of interview tasks that test the candidate’s skills with SQL. I’ve developed two sets of tasks: One for the preliminary phone screen (hello candidates who are googling me because you have phone screens this week or next week) and one for later in the process. The one for the later round needs to be similar in spirit to the data that we use on a daily basis, but it can’t be real because our real data comes from small children, and it would be wrong for us to share it – even anonymized – with interview candidates. I needed to conjure data out of thin air.
When creating the generated data for the interview task, I came face-to-face with the oft-repeated sentiment that R is slow. I’ve always brushed that off because most things that I’ve done with R have finished running in a few seconds. Is it worth it for me to write statistical code in a language not designed for statistics in order to save the computer a few seconds of work? Up until now, no. This interview task needs to parallel our real data in size and complexity, so I needed to make a lot of fake data. To make sure the relationships were not perfectly deterministic, I needed random number generators to fuzz the connections between some variables. I started my data-generation code running on a Friday afternoon, and it was still running Monday morning. At which point I decided to rewrite it in Node.js, even though it is definitely not meant for statistical programming. During this process I learned that d3 (an interactive visualization library used for making nice data visualizations for the web) has a really nice collection of random number generators. I was able to rewrite the entire thing in node before the R version finished running. The node version did take a few minutes to run, but I can live with that.
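As a rough illustration of what fuzzing the connection between two variables can look like, here is a minimal sketch in Python (not the R or Node code from the interview task). The variable names, the age range, and the coefficients are all made up for the example; the only idea it is meant to show is "deterministic relationship plus random noise."

```python
import random


def make_fake_records(n, seed=0):
    """Generate n fake records where score depends on age, plus noise."""
    rng = random.Random(seed)  # seeded so the fake data is reproducible
    records = []
    for _ in range(n):
        age_months = rng.randint(36, 72)        # hypothetical age range
        noise = rng.gauss(0, 5)                 # fuzz so the link is not deterministic
        score = 20 + 0.8 * age_months + noise   # hypothetical relationship
        records.append({"age_months": age_months, "score": round(score, 1)})
    return records


records = make_fake_records(1000)
```

Seeding the generator means the same fake data set can be regenerated later, which is handy if the table ends up in the wrong database the first time.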
There is probably more that I could say about how I made my random data other than “R is slow, and d3 came through for me,” but this would reveal the total Rube Goldberg machine of infrastructure that I am dealing with as well as my short-term thinking of writing the data to the database belonging to the server where the data generation script is running rather than to the database belonging to the interview server.
On the other hand, the phone screen SQL task is pretty standard stuff without any nuance or complication to the data. You would probably write very similar queries when interviewing for other companies; if you have an understanding of fundamentals, you should be fine.
I spent this weekend trying to get caught up on my knitting. I can’t show you the sweater right now because I am having a lot of angst about it. I was trying to kitchener together some 2x2 rib, and I don’t want to talk about it. I’m also still a bit worried that the arms are too long and/or the body is too short. These are all solvable problems, but I have a lot of other things on my mind right now. For example, the customer service rep who is allegedly handling my insurance claim for the damage to my gate has not confirmed receipt of the two estimates for having my gate repaired.
Meanwhile, I spent the better part of yesterday knitting up some swatches.
My goal is to find some sort of stitch that I can do mindlessly while doing other stuff and then use it to make scarves of some sort out of all the yarn that I impulse-bought but that I don’t have enough of to make anything big out of. I’ve previously tried to make things out of an assortment of various yarns. It never goes well, and the finished object usually ends up in the trash.
The ring-lace stitch isn’t a good choice for mindless because I have a really hard time keeping track of where I am in that pattern. I feel like I would need to have a stitch marker every six stitches in order to have any hope of making anything at all with that stitch. I will note that I have tried this lace pattern many different times over the years, and this is the first time that it was not a total fail. The secret is to use a normal needle size and not to think that going up a needle size is going to help. Also I tugged on it really hard (in the vertical direction), which made the holes much better defined.
You can’t really see all the things that I tried on the other swatch (btw, I used a US 5 needle for everything there; the stitches really are that different in size). Near the top I have some variations on linen stitch that probably fall under the umbrella of twill stitch. These aren’t mindless either because you need to keep track of where you are, and there is too much purling.
Sort of in the middle are some variants on the cartridge belt rib stitch. Much better because there is no purling.
The stripey one in the middle is brioche stitch, but done back-and-forth instead of sliding it to the other end of the needle. The vertical striped one near the bottom is a slight variation on brioche stitch where you slip the same stitches in both directions after you change colors.
Depending what happens in terms of this whole “return to the office” nonsense, I might also need to knit myself something to keep me warm while at work. The current office has a terrible HVAC system where it is always too cold somewhere in the building – and I usually have to be in that particular somewhere. We have too many people to go back to the old office, so we are getting a new office in December. However, no word yet on how cold it is in the new office. Thus, I am holding off on buying new yarn and trying to knit my way through the backlog.
In addition to the normal chaos that happens around here, we had some unexpected annoyance yesterday.
We’ve decided that the nice new house deserves nice new furniture, so we have been working on that. So far we have bought a rug (ok, not furniture, but part of the whole home-dec theme) and a cabinet.
We interviewed two decorators for this project. One of them (who we didn’t end up hiring) does all-custom work for the furniture and will sometimes buy back pieces from her old clients. I mention this because the cabinet that she recommended had previously been owned by one of the Real Housewives of Orange County (Vicki). I admit that I was intrigued by the possibility of owning a cabinet (well, in this case, a credenza) that had previously been on television.
Using the more conventional decorator means that our cabinet came from one of those furniture stores that you will see in your Facebook ads if you do anything at all that gives your browser cookies a “furniture” sort of vibe. We ordered the cabinet about a week and a half ago, and I got an email last week that it was going to be delivered. I got to pick which delivery option I wanted (I went with the $20 option for having the crew bring the package just inside the front door since it is a 100-pound cabinet). Then I clicked some options for getting texts about the status, clicked something to confirm my delivery time etc., etc.
The cabinet arrived yesterday. The cabinet itself is fine. It is a very nice cabinet.
The issue was with the delivery. This is a major furniture vendor, and this is a fairly large metropolitan area. There is a lot of furniture being delivered each day, so my cabinet was delivered in a very large box truck. In terms of the size, I’m guessing it was roughly equivalent to the largest U-Haul you could rent. This is a substantial truck. It would have a decent amount of momentum. It’s probably only slightly narrower than my driveway. Backing this truck up my (fairly steep) driveway would be very challenging.
Turns out that the driver was not up for the challenge, as he clobbered my gate.
The furniture company pinky-swears that they will coordinate everything for me to file a claim with the delivery contractor’s insurance company. So far I have not yet received the paperwork that they promised to email, so we shall see how this unfolds.
The yarn store has been sending me email. The message is: You bought a sweater’s worth of yarn about seven months ago and haven’t bought any yarn since then. Aren’t you done with that sweater by now? Shouldn’t you be buying more yarn? This reminds me of when I was in grad school and people would ask me how my dissertation was going. Actually, this is much better than grad school because the sweater is actually coming together pretty well, and a certain professor will not be examining every single stitch and making lots of critical comments.
I’m making the Botanical Yoke Pullover from Purl Soho. I wasn’t particularly happy with the way that the neckline looks on the model in the pattern photo, so I also added some short rows following a pattern for a totally different sweater.
I’m currently running into two problems making this sweater. No, wait, three.
Sophie loves to lie on the pattern. Sophie says that I should not consider her to be a problem. Thus, I will say that the first problem is that 12-over-12 cables are really annoying.
Too many other things to do. I literally spend hours per week watering plants that are out of range of the sprinkler system. I feel a little bit bad about this because we are having a drought in California. I’m also having more plants put in later this summer and am having mixed feelings about that. But some of the new plants are trees, and parts of my yard really, really, really need the shade. Also in terms of my yard, at some point I need to go out and rake flowers because jasmine flowers don’t seem to disintegrate the way that most other flowers do. I also have some tree that has been dropping yellow flowers to such an extent that I need to rake them or sweep them or something. And since we don’t have real seasons here, some plant is always dropping leaves instead of them all being nice enough to do it all at once in the fall.
I think that the arms of the sweater are too long. I really struggled to get row gauge as I was knitting. I have knit and re-knit the sleeves a total of four times (two sleeves \(\times\) two attempts). I think that they are still wrong. If they turn out to be only a little bit long at the end, then I am going to ignore that. If they are just past the point of ignoring, I’m going to unravel from the cuff end and then bind off with a crochet hook, even though that’s not ideal for ribbing. If they are off enough that the sleeve shaping is wrong, I’m going to have to remove the bottom third or half of the sleeve, reknit that part (YET AGAIN) with fewer rows between decreases, and then kitchener the replacements into place.
I was going to spend my weekend knitting, organizing my knitting stuff, and selling excess knitting stuff. (WOULD YOU LIKE SOME MACHINE KNITTING MAGAZINES FROM THE 1980s?)
However, I have finally gotten irritated at the way that San Diego County is(n’t) reporting its COVID-19 numbers. It used to be reported daily. I wasn’t entirely happy with the way that they broke down the data, but it was easy for me to go to their webpage and see the daily number of new cases. Now that we have decided that the pandemic is over, they announced that they were only going to update the numbers once a week. Apparently, they changed their minds, and now they are reporting on weekdays and each update shows the number of new cases since the previous update – so in order to get new cases per day, you need to know how many days since the latest update.
No problem, I thought, I will just dust off my old code for getting data from the County’s API. (Aside: Why does government love ArcGIS so much?) Turns out that the data served by the API is also updated on the same sporadic schedule (but at least the payload has dates).
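Since the payload has dates, turning "new cases since the previous update" into "new cases per day" is just a matter of dividing by the gap between updates. A minimal sketch in Python, assuming the updates are already sorted by date with no repeated dates; the numbers are invented for the example and are not real county data.

```python
from datetime import date


def cases_per_day(updates):
    """updates: list of (report_date, new_cases_since_previous_update),
    sorted by date with no duplicates. Returns (report_date, daily_average)."""
    rates = []
    for (prev_date, _), (cur_date, new_cases) in zip(updates, updates[1:]):
        days = (cur_date - prev_date).days  # days covered by this update
        rates.append((cur_date, new_cases / days))
    return rates


# Hypothetical numbers, not real county data
updates = [
    (date(2021, 7, 9), 300),   # Friday
    (date(2021, 7, 12), 900),  # Monday: covers Saturday, Sunday, Monday
    (date(2021, 7, 13), 350),  # Tuesday
]
rates = cases_per_day(updates)
```

The first update is dropped because there is no earlier date to measure the gap from. In this made-up example the scary-looking Monday number of 900 works out to 300 per day.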
Next up: Checking with the state. You would think that there was someone in the state of California who could put together a reasonable API. I think that I can get the data that I need, but it took a lot of digging to find it. The state’s data might also only be sporadic. I may have to monitor this over the next few days before I make a dedicated page that I can use to keep track of things.
The state’s data is not great either. I really want to see the case data broken down by age. I really, really want to know how much I should be annoyed that the new cases are coming from unvaccinated adults (most of whom should know better) vs. unvaccinated children (typically not their fault).
While there is a chance that they’d consider binning things by age, there is no chance that they’re going to break things down by vaccination status (especially since a lot of people are bad at interpreting numbers). That’s really what I want to know so that I can decide how much of a risk it is for me to go to Trader Joe’s and the yarn store. I’ve gotten so frustrated buying yarn online and being disappointed by the colors.
Fun fact about the state API: you send SQL to the endpoint, and it returns the result of your query. In order to prevent you from doing SQL injection, the state merely wraps your entire query in
SELECT * FROM () AS blah LIMIT 32001. The “blah” is not a placeholder that I am using here; it is really part of what is processed by the endpoint. There doesn’t seem to be anything that keeps me from doing some sort of gnarly self-join on the table, but fortunately, today’s task is simple enough that I won’t accidentally write a query that grinds their server to a halt. They’re running Postgres 9.6.19, which you can learn by sending
SELECT VERSION() as your query instead of asking about public health data.
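To make the wrapping concrete, here is a small Python sketch that mimics it with plain string formatting. The function name is made up, and this only imitates the behavior described above; I have no idea how the state's server actually implements it.

```python
def wrap_query(user_sql):
    """Mimic the wrapping: the caller's SQL becomes a subquery aliased as blah."""
    return f"SELECT * FROM ({user_sql}) AS blah LIMIT 32001"


wrapped = wrap_query("SELECT VERSION()")
print(wrapped)  # SELECT * FROM (SELECT VERSION()) AS blah LIMIT 32001
```

The LIMIT caps how many rows come back, but it does nothing about how much work the inner query does, which is why a gnarly self-join would still be a problem.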
If you want to experience getting cold emails from recruiters without actually receiving email, just click the button!
About a week and a half ago I got an email from the electric company called “Energy Use Alert” letting me know that so far I had used $5.18 worth of electricity during the billing cycle and that my monthly bill was projected to be somewhere between $8.80 and $11.90. This email included a helpful tip: “Want to lower your bill? Reduce your electricity use during On-Peak hours.”
I have a ground-based solar array with 26 panels, and they are filthy. I suspected that washing the solar panels would do more for my electric bill than avoiding using appliances at dinner time.
A quick review of the internet told me that solar panel washing is a scam. You should not pay someone to wash your solar panels because it does not help. Washing solar panels leads to a 1-2% improvement, maybe 3% if they are really dirty.
This seemed preposterous to me, as there is very obviously a whole bunch of opaque dirt and bird crap between my solar panels and the sun. I see a daily dip in production when the earth arranges itself so that there is a tree between the solar panels and the sun, and the tree is not particularly large nor particularly near the solar panels. Temporary tree-shade is a bigger problem than persistent dirt?
This, of course, meant that I needed to test my hypothesis. I got some distilled water and a window-washing doohickey and started washing half the panels. Sadly, I am not tall enough to reach all of them, so I ended up washing almost a third of the panels.
In order to compare the performance of the washed ones to the unwashed ones, I did my split as a checkerboard because it would be a bad idea to only wash the ones that are shaded by the tree. I washed them on June 30, so the July report for energy produced to date is going to give a good sense of the difference in production between the clean panels and the dirty ones. I’ve circled the clean ones.
The sample size is small enough that I’m not going to bother to calculate a confidence interval for how much better the clean ones are doing than the dirty ones. Back of the envelope, I’m pretty comfortable saying that I’m seeing a roughly 5% improvement in production due to cleaning. The best improvement from dirty -> clean comes from the part of the array that doesn’t get shaded by the tree.
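For anyone who wants to do the same back-of-the-envelope comparison, it amounts to averaging each group and taking the relative difference. A minimal Python sketch; the per-panel numbers below are hypothetical placeholders chosen to land near 5%, not readings from my array.

```python
from statistics import mean


def percent_improvement(clean_kwh, dirty_kwh):
    """Relative difference between the average clean and average dirty panel."""
    clean_avg = mean(clean_kwh)
    dirty_avg = mean(dirty_kwh)
    return 100 * (clean_avg - dirty_avg) / dirty_avg


# Hypothetical month-to-date production per panel, in kWh
clean = [10.5, 10.6, 10.4]
dirty = [10.0, 10.1, 9.9]
improvement = percent_improvement(clean, dirty)
```

Comparing group averages only makes sense because the washed panels were spread across the array; if they had all been the tree-shaded ones, the difference would say more about the tree than about the dirt.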
As with many good A/B tests, I have a pretty strong takeaway here: I need to get out the ladder and clean the rest of the solar panels.