Learning to Code

Sun, Mar 10, 2013

One of my secret vices is reading polemics about whether or not some group of people, usually humanists or librarians, should learn how to code. What’s meant by “to code” in these discussions varies quite a lot. Sometimes it’s a markup language. More frequently it’s an interpreted language (usually Python or Ruby). I have yet to come across an argument for why a humanist should learn how to allocate memory and keep track of pointers in C, or master the algorithms and data structures in a typical introductory computer science textbook; but I’m sure such arguments are out there.

I could easily imagine someone in game studies wanting to learn how to program games in their original environment, such as 6502 assembly. A good materialist impulse, such as learning how to work a printing press or bind a book, should never be discouraged. But what about scholars who have an interest in digital media, electronic editing, or text mining? The skeptical argument here points out that there are existing tools for all of these activities, and the wise and conscientious scholar will seek those out rather than wasting time reinventing an inferior product.

This argument is very persuasive, but it doesn’t survive contact with the realities of today’s text-mining and machine-learning environment. I developed a strong interest in these areas several months ago (and have posted about little else since, sadly enough), even to the point where I went to an NEH seminar on topic modeling hosted by the fine folks at MITH. One of the informative lectures recommended that anyone serious about pursuing topic modeling projects learn the statistical programming language R and a scripting language such as Python. This came as little of a surprise to me as being reassured later in the evening by a dinner companion that Southerners were of course discriminated against in academia. I had begun working with topic-modeling packages in R, and a great deal of text-munging was required to assemble the topic output in a legible format. MALLET makes this easier, but there’s no existing GUI solution* for visualizing the topics (or creating browsers of them, which some feel is more useful**).
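For a sense of what that text-munging involves, here is a minimal sketch in Python. It assumes the tab-separated layout that MALLET writes with its --output-topic-keys option (topic number, a weight, then the top words); the sample rows and the function names read_topic_keys and format_topics are made up for illustration and do not come from any real model or library.

```python
import io

# Made-up sample rows in the assumed topic-keys layout:
# topic number <TAB> weight <TAB> space-separated top words.
SAMPLE_KEYS = (
    "0\t0.25\tpoem verse line stanza meter\n"
    "1\t0.25\tnovel plot character narrator chapter\n"
)


def read_topic_keys(handle):
    """Return a list of (topic_id, weight, words) tuples from an open file."""
    topics = []
    for line in handle:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        topics.append((int(parts[0]), float(parts[1]), parts[2].split()))
    return topics


def format_topics(topics, top_n=3):
    """Render each topic as one readable line showing its first top_n words."""
    return [
        "Topic {}: {}".format(topic_id, ", ".join(words[:top_n]))
        for topic_id, _weight, words in topics
    ]


# io.StringIO stands in for a real file opened with open(path).
topics = read_topic_keys(io.StringIO(SAMPLE_KEYS))
for line in format_topics(topics):
    print(line)
```

Nothing here is difficult, but every step (splitting on the right delimiter, converting types, skipping bad lines) is the kind of literal-minded detail that a packaged tool would otherwise hide.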

Whatever flexibility being able to dispense with existing solutions might offer you is more than counterbalanced by the unforgiving exactitude and provincial scrupulousness of programming languages, which manifestly avoid all but the most literal interpretations and cause limitless suffering for those foolish or masochistic enough to use them. These countless frustrations inevitably lead to undue pride in overcoming them, which leads people (or at least me) to replace a more rational regret over lost time with the temporary confidence of (almost always Pyrrhic) victory.

An optimistic assessment of the future of computation is that interfaces will become sophisticated enough to eliminate the need for almost anyone other than hobbyists to program a computer. Much research in artificial intelligence (and many of the most promising results, as I understand them) has been in training computers to program themselves. Functional programming languages, to my untutored eye and heavily imperative mindset, already seem to train their programmers to think in a certain way. The correct syntax is the correct solution, in other words; and how far can it be from that notable efficiency to having the computer synthesize the necessary solutions to any technical difficulty or algorithmic refinement itself? (These last comments are somewhat facetious, though the promise of autoevolution was at the heart of cybernetics and related computational enthusiasms; the recent English translation of Lem’s Summa Technologiae is an interesting source here, as is Lem’s “Golem XIV.”)

I can’t help but note that several of the arguments I’ve read that advise people not to learn to code, and not to spend time teaching other people how to if you happen to be unlucky enough to be in a position to do so, are written by people who make it clear that they themselves know how. (I’m thinking here in particular of Brian Lennon, with whom I’ve had several discussions about these matters on Twitter, and also of David Golumbia.) Though I don’t think this myself, I could see how someone might describe this stance as obscurantist. (It’s probably a matter of ethos and also perhaps a dislike of people who exaggerate their technical accomplishments and abilities in front of audiences who don’t know any better, if you could concede that such things could exist in the DH community.)

*Paper Machines, though I haven’t tried it out, can now import and work with DfR requests. This may include topic modeling functionality as well.

**I have to admit that casual analysis (or exacting scrutiny) of my server logs reveals that absolutely no one finds these topic browsers worth more than a few seconds’ interest. I haven’t yet figured out if this is because they are objectively uninteresting or if users miss the links because of the style sheet. (Or both.)

Comments

There were some useful comments on this post, which I have reproduced below:

Brian Lennon 3/10/2013 at 1:31 pm

Thanks for the thoughts. My own position on this issue is shaped by questions of ideology, economic opportunism, and how the university’s relations with other social and economic bodies and polities are imagined — boldly or cravenly as the case may be — as well as smaller things like the disciplinary histories of philology, comparative literature, the self-differentiation of “creative writing” within English studies, and so on. I don’t really see a point elaborating those questions here, since I don’t see in your work any desire to trade on technical competencies for advantage in the intra-disciplinary conflicts that I think are really at issue here (in, say, a nativist, quietist, neo- or crypto-positivist counter-revolution against big, bad, continentally foreign, continentally political “Theory,” or against the anti- and post-colonial provincializing of Europe, or against the U.S. demographic inversion represented by the many categories of “minority” studies).

Who does have that desire? About 47% of the U.S. humanist professoriate, I would say. Unfortunately for them, only about 1% of that group have the technical competencies to trade on. But precisely in that context, the finitude of lived human time means that even very low-frequency boasting-bullying about practical “coding” installs the humanist who already has such ability permanently at the peak of a pyramid of competence in technical skills that very rapidly become obsolete, but which in a “crisis” like that following 2001 and especially 2007-2008 align with what panicky political short-termists, and an understandably confused and traumatized “public,” want to think higher education should provide.

That opportunism is what I dislike in others, and refuse for myself — though I would not even have a solid basis for it, myself, without taking at least a few years, full time, to ground and organize the bits and pieces of knowledge that have come to me ad hoc (I’d say “essayistically”) while I was pursuing the more disciplined acquisition of what I personally consider genuine humanist expertise, in intellectual history and its human languages. (In that sense, at least, “ethos” is precisely the right word.)

I don’t in fact see any of our humanist colleagues exaggerating their technical accomplishments. But I think there certainly is in “DH” an implicit boasting-bullying about accomplishments that are simply not technically impressive at all, calculatedly performed in a “humanist” context characterized by very low average technical competency and — more importantly, in my view — a combination of real vulnerability and elected cowardice in so-called public relations.

To your other point: it’s not at all difficult for me to imagine the competencies you’ve acquired being attractively packaged for end users and centrally integrated into JSTOR, etc. within a decade or even much, much sooner. (That’s a vulgar paraphrase of David Golumbia’s point here: http://www.uncomputing.org/?p=206 .) From a broadly “philological” standpoint (which is my own standpoint, and which is not at all merely “materialist,” good or otherwise), that could hardly mean that you wasted your time acquiring them; but on the other hand, I’ll wager that the point I understand you to be making with the phrase “doesn’t survive contact with the realities of today’s text-mining and machine-learning environment” will not age well, for just that reason. To take your own example, Paper Machines is an admirable piece of work, very easy to install and very easy to use, certainly easy enough so that scholars with no knowledge of the techniques and algorithms involved can judge its usefulness for themselves, in relation to their own work, and can use it to test any claims that might be made for the novel or non-novel utility of those results, and thus for the long-term value of the scholarly labor that produced them. (Here, on standards of evaluation in the humanities, which I think looms as a reckoning for so-called “digital humanists,” I’d refer readers to Natalia Cecire, as well: http://nataliacecire.blogspot.com/2013/03/still-further-beyond-cave.html .)

Such claims may not, today, seem anywhere near as silly as the claims made for investment in HTML competency, in the mid- to late 1990s (http://blogs.princeton.edu/librarian/2013/03/why-i-ignore-gurus-sherpas-ninjas-mavens-and-other-sages/ ), or as silly as claims made in the later 2000s for investment in a deliberately forgiving interpreted scripting language like Python. But in so far as the adjusted relative level of technical competency among humanists in 2013 is probably not much higher than it was in 1995, we are talking about effects of scale in which time cycles as much as it flows.

Jonathan 3/10/2013 at 3:09 pm

I had intended to respond to Natalia Cecire’s intriguing piece when I began writing, but I felt that material should be saved for another post. One point is that I can’t think of a topic that digital humanists have written more about than the necessity of convincing P&T committees, deans, and provosts that non-print publication is valuable and that digital projects count as research. And then it becomes “what is a project?”, or “why is a project or a tool the only unit of value?”, which is what the discussion of the Short Guide to the Digital Humanities quickly focused on.

She asks why people aren’t talking about what makes good projects interesting or valuable, and then lists several examples. That would also mean talking about what makes bad projects uninteresting and not valuable, I would think. And that leads to the institutionalized niceness (disguising savagery and the will-to-power as you and David have pointed out on several occasions) and consequent utopian hopes for academic reform invested in all things digital. I haven’t worked out what I think about this yet.

I’m much more likely to engage in a neo-positivist counter-revolution than any other kind, however, as I have almost never heard anyone engaged in a ritual denunciation of positivism be clear about what variety of it they mean (Comte, Carnap, anything ‘scientific,’ ‘empiricist,’ or ‘formalist’) or if there have ever been legitimately positivist methods employed in literary analysis. (What would they be? Nothing as theory-laden as 19th C philology could call itself positivist, could it?)

An algorithmic criticism that settles once and for all cruxes in Keats or other turns of the screw is a useful Borgesian thought-experiment, I maintain, and I’m sure someone’s working on it. It was interesting to me how quickly Stephen Ramsay’s Reading Machines turned to Oulipo and pataphysics, for example.

Not every humanist with rapidly obsolete technical skills leverages them into any type of actual or cultural capital, as I can tell you from direct experience. Factors such as institutional affiliation, capacity for self-promotion, travel funding, children, and salary bear directly on whether or not you can travel to the conferences and colloquia where the important connections in the unusually insular DH community are made and maintained. Other sub-disciplines are like this too, though they don’t generally advertise themselves as being open and transformational. (I realize this may sound bitter, but I intend it to be wistful and bemused.)

While your 47% figure might not be a deliberate exaggeration, the 1% one certainly is, as I think I pointed out on Twitter. But again, it’s only certain people who leverage their technical skills–whatever they may be–in moments of crisis.

Paper Machines was not easy to use a few months ago. At least not in the sense that interested graduate students could reliably install it on their computers without having to fiddle with having the correct version of Python and a host of other dependencies (if they were using, say, Snow Leopard instead of a more recent version of the Mac OS, and I don’t even want to think about what it was like on a PC). But those details are easy to sort out, and it sounds like they have been. The regular LDA algorithm is usable and the results visualizable for just about anyone already. My point about the machine-learning environment still holds, however, if you are interested in LDA-variants or other algorithms, which generally have to be compiled from source or even coded from published descriptions in some cases. But, to be fair, there are very few good reasons for a humanist to want to do this.

Brian Lennon 3/10/2013 at 6:31 pm

“That would also mean talking about what makes bad projects uninteresting and not valuable, I would think.” Indeed. That won’t be a comfortable moment, given the investments in what Natalia calls “virtue.”

Also, I do recall that particular Twitter remark, and thinking that you were right.

As for positivism, I usually specify “Comtean and Marxist” — but perhaps more to defer the real historical specificity that you’re right to suggest should be brought to bear. I find Pieter Verburg’s Language and its Functions useful, less for clarity than for suggestive and entertaining narration of the history of conflict between “humanism” (as a “lingualism”) and “positivism” (as well as what he calls “axiomatic rationalism,” “scientism” or “scientialism,” etc.) in philological intellectual history, which is the history I’m interested in. (19C philology certainly was tempted by “positivism,” especially in the work of Bopp: Verburg says we should hold that against him, even though Bopp is otherwise the book’s hero.)

Verburg’s specification is “scientific certainty in the form of the exceptionless application of biological and physical laws” and — more importantly, in my view — in their colonizing extension to other domains of knowledge. Which sounds about right to me, if you take it not as a body of propositions for interminable debate but as one of several possible tokens (some embraced, some perhaps imposed) for an ideologically competitive hostility to “metaphysics” and investment in the cultural authority of mathematics and logic, plus scientific method, against the cultural authority of their rivals, religion and secular humanism.

In that sense, I think “ritual denunciations,” while imprecise, have a political content that isn’t exhausted by the demand for specificity, even when that demand is not met. We can call it another name, or specify it so as to be both more historically responsible and less inflammatory to those of our colleagues who, perhaps in a revelation of the political unconscious, seem to instantly identify themselves as its targets — but the opportunism of the applied “culture” that triumphed with the Second World War, and which only the intellectual movements of the 1960s have really resisted, since then, was and is real.

David Golumbia 3/10/2013 at 9:22 pm

A little pressed for time, so I’ll be sketchier than I’d like. I appreciate both the post and the discussion, and as both of you know I’m very sympathetic with the institutional and political issues Brian raises–they are strategic and I believe that much more about DH is strategic than the everyday discussions would suggest.

But to keep here to the tactical, I’ll pick up on three points:

1. You rightly point out and pay significant attention to the fact that “What’s meant by ‘to code’ in these discussions varies quite a lot.” I think this may be the most overlooked part of these conversations. Very often–in fact I’d say a majority of the time–what it comes down to is a basic understanding of programming; in fact, in a comment on one blog or another, Stephen Ramsay indicates that what he teaches (I think even in his PhD coding classes) remains prior to Computer Science 101, a class which itself presumes basic knowledge of programming and algorithms. It seems to me that this truly is something that EVERYONE should learn, and preferably learn it in high school. While I know many high schools teach it, not all do; the same for colleges. So if what is meant by “humanists should learn to code” is roughly the same as this basic knowledge (which could be taught in any procedural or interpreted language, but probably not markup per se), I’d only agree in so far as I think this should be a part of every student’s education. I would even like (and believe this is often the case) for basic notions of markup to be taught in these relatively elementary environments: let’s say, “algorithmic thinking and basic markup principles” strike me as basic parts of any education today, as much as just about anything else you could name. (One could also cite Rushkoff’s Program or Be Programmed here, which is very vague about just what level/kind of programming he means, and often comes down to this kind of basic education in algorithmic thinking–so that we understand in principle how computational stuff works–but to my mind fails to show how being a true programmer impacts any user’s experience with Facebook or Twitter or any of the other major software behemoths Rushkoff worries about.)
2. What worries me most, of course, is the nature–beyond what I’ve just said–of the imperative “should.” I support, beyond doubt, certain imperatives: I think English PhD students should have a basic grasp of the periods and genres and theoretical frameworks applicable to literature in English. I honestly think they should have a reading knowledge of one or even two foreign (human/natural) languages. I don’t think that they should know how to code as a required part of an English (or even Comp Lit) PhD. (The question is trickier for straight Digital Humanities PhDs.) That doesn’t mean coding is unwelcome or irrelevant–far from it–but to my mind it’s an option (as many other things are options) rather than a requirement.
3. This leaves the question of whether coding “should” be a part of the educational toolkit within English/Comp Lit Departments (again, I’m leaving aside straight DH units, as they are too interdisciplinary to be characterized easily, and don’t impact the field I happen to care about a lot). My view is that this has been oversold to some degree. I don’t mind at all special-case classes (the best example being TEI, which makes sense being taught in literary studies), and I don’t mind a certain amount of coding-as-a-kind-of-writing being used in a composition context; but other than that I would prefer bridge-building with the appropriate campus (and off-campus) professionals for whom coding is their primary enterprise. As I’ve written in other contexts, the most useful model to me seems to come from Computational Linguistics, and this is what they do: a few special-purpose classes devoted to the topic, and a strong reliance on existing CS/EE programs for people who really want to get their hands dirty.

I admire you and Brian and Stephen and other folks who’ve taught themselves R (I presume that’s how you got it), and I’d encourage others to do so, whether on their own or in classes; I just chafe at the notion that anyone other than CS or EE grads must learn how to code–again, though, with the proviso of my item #1. I am interested in results much more than how one arrived at the results, for computational humanities research the same as all other humanities research.

Jonathan 3/10/2013 at 9:40 pm

I think it’s much more honorable, if you think that people are doing bad work, to say it outright: name names and give reasons. I see some of this, but insinuation and innuendo are much more common. Of course most of these comments I see on Twitter, which by definition lacks context.

I did not know of Verburg’s work, though I can’t help but wonder what he would have made of Cartesian Linguistics and the rest of Chomsky’s ideas, combining as they do rationalism with a distrust of (positivist) methods of description and classification.

The phrase “secular humanism” evokes for many Dawkins and his ilk, who are firmly on the other side of that equation, though I do not know if that was an intentional echo on your part.

I think Chomsky is a useful example for thinking about the culture you mention and the reactions against it. Clearly the author of American Power and the New Mandarins was part of an intellectual resistance, but then there is the rationalism, the formalization and mathematization of linguistics, and MIT → Pentagon, etc. David Golumbia’s book has a provocative reading of Chomsky’s intellectual legacy; while I found it to be one of the very few things written by a critical humanist that engaged with his ideas in any depth, I still ended up disagreeing with it almost completely. I want to write more about his book at some point, so maybe I can take this thread up then.

Brian Lennon 3/11/2013 at 12:24 pm

This is an assumption, to be sure, but I think we’re working with very different frames and ranges of reference here. Verburg is not, to the best of my memory, especially interested in Chomsky — like Eco’s The Search for the Perfect Language, his is a sweeping book about the history of the imagination of language in “the West,” in which 20C and contemporary linguistics are only that — 20C and contemporary linguistics. I’m not especially interested in Chomsky either, for the same reason, though of course Chomsky is a central figure in a postwar North American frame of reference taken on its own terms.

And by “secular humanism” I mean firstly what Verburg means: three waves, beginning in 15C Italy with Valla and Bruni and in the north with Erasmus, Ramus et al.; continuing in Goethe, Herder, Humboldt et al.; perhaps ending with Nietzsche and his contemporaries.

Secondly, and more importantly, I mean what Said means by it, drawing on Vico and combining it with what he derived from Gramsci and (selectively) from Foucault: “secular criticism” as and in the critique of Orientalism. And I also mean what people like Bruce Robbins, Aamir Mufti and Paul Bové, among others, have done with Said’s legacy. (The new issue of boundary 2 just out, on the topic of post-secularism, is a useful reprise.)

But of course I have enjoyed David’s writing about Chomsky as well, and would also look forward to reading whatever you might write about it. What I’m saying is that apart from one common reference point in the postwar (post-WWII) period, I think we are working with very different ranges of historical focus, at least at the moment.

Jonathan 3/13/2013 at 5:46 pm

I browsed around the Verburg book, and there was almost nothing about Chomsky in it, which I agree is not surprising given Verburg’s frame-of-reference. But Cartesian Linguistics, a book described by Ernest Gellner as “irresponsible ancestor-snatching,” is a useful example of Chomsky examining some of the same historical territory from his own theoretical perspective.

I am familiar with the senses of secular humanism you describe, but I thought the incompatible doctrines also associated with that term were amusing to note given the earlier discussion of positivism. I mean, who could be more positivist in the sense you describe above than Richard Dawkins?

I hope to respond later to David’s points about algorithmic/programming literacy and how that relates to the “learning to code” imperative.