Richard Ristow (continued)
We shared tea and croissants at the Seven Stars Bakery in Providence

King Douglas: What makes a problem difficult?

Richard Ristow: A fuzzy notion of the problem, of course. I've often posted with a deduction about what the poster really wanted. It can be fun as an exercise, but it's certainly harder. The worse the data is organized, the harder. That's especially true when the answer depends on many variables or many cases, and the grouping of variables, or the keying of cases, isn't clear. Some effects, like many-to-many merges, SPSS simply does poorly. Sometimes I'm out of my depth, statistically, and then I say so.

King Douglas: Many problems can be solved in a variety of ways. Do you value parsimonious code and work to attain it? Or is parsimony irrelevant given that a computer is going to do the work?

Richard Ristow: I don't go for the shortest code, by any means. Certainly not the fewest lines in the listing. I write to be read, and I use a lot of space to make it more readable. Comments, of course. Indenting code within a control structure (in SPSS, the pseudo-indent, starting a statement with a period and then indenting). I'll often put the logical parts of a statement on separate lines. I'll declare variables NUMERIC before computing them, and assign labels where that makes sense.

But I try not to use statements, or computing time, that aren't necessary; and I like a single, clear flow of computation. For example, if I use INDEX to find the location of a substring, and need that location several times, I'll assign it to a scratch variable rather than putting the INDEX expression in each time. The speed advantage is invisible, but it's cleaner: the logic is in one place, the only place you need to understand or fix an error in. And it's easier to read: you see a reference to a quantity you know, rather than an expression you have to check to see if it's the same.
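The scratch-variable habit described above can be sketched in a few lines. This is only an illustration in Python, not SPSS syntax and not Richard's own code; the function name `split_at_marker` is hypothetical, and Python's `str.find` stands in for INDEX.

```python
def split_at_marker(text, marker):
    """Split text around the first occurrence of marker."""
    # Find the location once and keep it in a named variable,
    # much as one would assign INDEX(...) to a scratch variable.
    pos = text.find(marker)  # -1 when the marker is absent
    if pos == -1:
        return text, ""
    # Reuse the stored location rather than repeating text.find(marker).
    before = text[:pos]
    after = text[pos + len(marker):]
    return before, after


print(split_at_marker("Ristow, Richard", ", "))  # → ('Ristow', 'Richard')
```

The search expression appears in exactly one place, so a change to how the marker is located only has to be made, and checked, once.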

I prefer looping code -- DO REPEAT or LOOP -- to separate parallel statements. I'll sometimes do that even if the result takes just as many lines, because I think it's clearer and neater.

I like to use a system according to its own logic. When there's a nice tool like RECODE, I like to learn all its tricks and get the most out of it.

SPSS is built around the implicit loop through the records; I like to get all I can out of that, before writing an explicit loop. It's one reason to be a fan of 'long' over 'wide' data organization.

Big wastes bother me. You've seen me write, time and again, about how often EXECUTE is misused.

King Douglas: Have you ever seen a solution to a problem that you thought was particularly elegant, poetic or beautiful? I'm speaking here of SPSS code.

Richard Ristow: Well, I have a very pronounced style, so I tend to like my own code. I remember a cute little loop to find the longest run of consecutive zeroes in a list of variables.
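The loop Richard mentions is not reproduced in the interview. As a rough idea of the problem it solves, here is a minimal Python sketch that finds the longest run of consecutive zeroes in a list of values. It assumes the values of one case have already been gathered into a list; the function name `longest_zero_run` is hypothetical, and this is not a reconstruction of his SPSS code.

```python
def longest_zero_run(values):
    """Return the length of the longest run of consecutive zeroes."""
    longest = 0   # best run seen so far
    current = 0   # length of the run currently being counted
    for v in values:
        if v == 0:
            current += 1
            longest = max(longest, current)
        else:
            # Any non-zero value (including a missing one) ends the run.
            current = 0
    return longest


print(longest_zero_run([1, 0, 0, 3, 0, 0, 0, 2]))  # → 3
```

One pass with two counters is enough; no intermediate list of runs needs to be built.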

I once posted a note to the SPSSX list, complaining that AUTORECODE gives you the results, but doesn't generate code you can then modify. Raynald Levesque posted a very sweet solution using AGGREGATE.

Some of Jan Spousta's code is excellent programming, what I'd call excellent old-school: compact, perhaps not easy to read, with surprising and elegant techniques. I use SUBSTR on the left of an assignment a lot more since following Jan's example.
