The Stronghold of Bioinformatics

No one likes gamification or MOOCs, as far as I can tell. Or rather, anyone trained in the hermeneutics of suspicion might find it hard even to accept that they exist; it is hard to imagine a concept that would cry more piteously to the heavens for critique. True to form, until a few weeks ago I had never earned a badge in my life and would have regarded the prospect of doing so with contempt and a touch of pity for whoever was naive enough to suggest it.

Topics in Theory

After experimenting with topic models of Critical Inquiry, I thought it would be interesting to collect several of the theoretical journals that JSTOR has in their collection and run the model on a bigger collection with more topics to see how the algorithm would chart developments in theory. I downloaded all of the articles (word-frequency data for each article, that is) in New Literary History, Critical Inquiry, boundary 2, Diacritics, Cultural Critique, and Social Text.

Same Stuff, Different Graph

When I started experimenting with graphing changes in topic-proportions over time, I didn’t pay much attention to the design of the graph. I could see that it was far too busy, but I assumed that this would be relatively easy to adjust using ggplot2’s many parameters. It wasn’t. It didn’t take me too long to figure out that I needed to change the data from discrete to continuous in order to see anything like a sparkline, but it was also apparent from the other data sets I was working with that taking the mean at intervals was the only way to make a reasonably clean graph.
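For anyone who wants to see what taking the mean at intervals amounts to, here is a minimal sketch in Python rather than R. It is not the code behind these graphs: the function name `mean_by_interval`, the five-year width, and the sample values are all made up for illustration. It groups (year, proportion) pairs into fixed-width bins and averages each bin, which leaves far fewer points to plot.

```python
from collections import defaultdict
from statistics import mean


def mean_by_interval(points, width=5):
    """Average (year, proportion) pairs within fixed-width year intervals.

    `points` is an iterable of (year, proportion) tuples; `width` is the
    interval length in years. Both names are hypothetical, chosen for this
    sketch only.
    """
    buckets = defaultdict(list)
    for year, proportion in points:
        # Map each year to the first year of its interval, e.g. 1987 -> 1985.
        start = (year // width) * width
        buckets[start].append(proportion)
    # One (interval_start, mean_proportion) pair per interval, in time order.
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Invented values, for illustration only.
sample = [(1980, 0.10), (1981, 0.30), (1984, 0.20), (1985, 0.40), (1987, 0.60)]
smoothed = mean_by_interval(sample, width=5)
```

The resulting pairs can be handed to whatever plotting library is in use; a wider interval gives a cleaner line at the cost of detail.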

Visualizing Topics in ELH

I was impressed with Ian Milligan’s visualizations of Canadian parliamentary debates, and I wanted to try to visualize some of the topic models I’ve been creating from JSTOR’s Data for Research. ELH I thought would be an interesting journal to try, as it publishes articles in each issue on quite a range of literary periods, often ranging from medieval to twentieth-century material. I assumed that LDA would be likely to identify each of these periods as a topic.

Experimenting with Dynamic Topic Models

When I first began reading about topic modeling, I very much wanted to experiment with “dynamic” topic modeling, or the tracking of changes in topics over time. David Blei and John Lafferty describe their algorithm in this paper. They also have made a dynamic topic model browser of Science available. I was very impressed with this project and wanted to apply the technique to journals in the humanities using JSTOR’s Data for Research (DfR).

Creating Topic Models with JSTOR's Data for Research (DfR)

Here are some instructions for creating the same types of topic models of JSTOR’s journals that I did with Critical Inquiry and Signs. These instructions are designed for someone using a Mac or Linux platform. (The differences below between using Linux and a Mac should be apparent to anyone who uses Linux, so I’m not going to indicate them here; it’s mainly where files are stored.) All of this should work on Windows, but you’ll need to install Cygwin or use alternate shell commands.

Two Critical Inquiry Topic Models

Here are two topic models of Critical Inquiry, generated with the same algorithm but different implementations: MALLET and the R topicmodels package. The two used slightly different stopword lists, and the latter was also generated with a minimum word frequency of seven:

0 black music musical white african jazz sound performance american racial negro song cultural sounds race rap cage singer composer
1 meaning theory interpretation question philosophy language point claim philosophical sense truth fact argument knowledge intention metaphor text account speech
2 american duke james john trans culture life william things modern cambridge michael david robert soviet shame henry york objects
3 trans time question subject derrida language place order object relation word thing reading moment longer things work thought writing
4 history historical narrative discourse account contemporary terms status context social ways relation discussion essay sense form representation specific position
5 god christian history religious greek ancient modern tradition divine century body early philosophy latin nature religion medieval church soul
6 film cinema films camera screen images frame image movie theater shot early visual cinematic narrative kiss hollywood scene documentary
7 science scientific human knowledge media theory sciences natural studies life social history technology communication machine humanities disciplines system psychology
8 body time game process space affect play form motion hand ways level attention bodies turn making figure physical parts
9 political social politics cultural power culture theory society ideology critique intellectual ideological state economic class liberal struggle revolution marx
10 law legal public case justice trial political war court state violence rights states moral crime speech abuse slave united
11 art painting work visual image picture artist images fig paintings works artists artistic photography aesthetic museum photograph objects object
12 form work time terms art nature individual structure order reality analysis general style works experience concept process elements theory
13 literary literature criticism text reading book work critics texts writing reader fiction language english author readers works read critic
14 life human moral good man sense great experience work fact kind find personal idea mind character people view social
15 time years people day great long house young called read book wrote man word times year men left english
16 story love death man dead face life eyes point scene narrative moment long real james stories heart narrator characters
17 italian di del fig della il fascist spanish inca ii italy autumn giovanni st saint che text building verdi
18 german history benjamin trans historical freud von germany art modern das memory panofsky essay berlin early hegel walter war
19 public war time national city american education work social economic space people urban culture corporate building united market business
20 poetry poem poet language poems poetic poets english lines literary lyric word romantic text verse poetics prose pound milton
21 women sexual female woman male feminist desire sex men sexuality mother gender freud identity body psychoanalytic child psychoanalysis feminine
22 jewish jews israel israeli palestinian state arab jew religious land people political religion identity palestinians muslim islamic rabbi al
23 french en france title sur qui dans paris une paul text ne foucault letter est man jean derrida au
24 cultural european culture colonial western american national chinese african indian native english identity white british racial race south africa
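Since the stopword list and the minimum word frequency both shape which words can appear in a topic, a small sketch of that filtering step may help. This is an illustration in Python, not the MALLET or topicmodels code used here. I am assuming that "minimum word frequency" means a word's total count across the whole corpus; the function name `filter_vocabulary` and the sample counts are invented.

```python
from collections import Counter


def filter_vocabulary(doc_counts, stopwords, min_freq=7):
    """Drop stopwords and rare words from per-document word counts.

    `doc_counts` maps a document id to a {word: count} dict. A word is kept
    only if it is not a stopword and its count summed over all documents is
    at least `min_freq` (an assumed reading of "minimum word frequency").
    """
    totals = Counter()
    for counts in doc_counts.values():
        totals.update(counts)
    keep = {w for w, n in totals.items() if n >= min_freq and w not in stopwords}
    return {
        doc: {w: n for w, n in counts.items() if w in keep}
        for doc, counts in doc_counts.items()
    }


# Invented counts, for illustration only.
docs = {
    "article_a": {"the": 50, "jazz": 4, "cage": 1},
    "article_b": {"the": 40, "jazz": 5, "film": 2},
}
filtered = filter_vocabulary(docs, stopwords={"the"}, min_freq=7)
```

With these made-up numbers only "jazz" survives: "the" is a stopword, and "cage" and "film" fall below the threshold. A different stopword list or threshold changes the vocabulary the model sees, which is one reason two implementations of the same algorithm can produce different topics.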

Topic Modeling Signs

Natalia Cecire tweeted during the topic-modeling workshop that she was momentarily excited by thinking that a presentation on the journal Science was on Signs: Journal of Women in Culture and Society. As it turns out, I have been experimenting with creating topic models from JSTOR’s Data for Research, and I decided to see what the Signs corpus would come up with. I downloaded word-frequency data for all the issues of the journal.

Errol Morris’s A Wilderness of Error: The Trials of Jeffrey MacDonald

I am from North Carolina. I’m quite familiar with the eastern part of the state, having lived there off and on for almost a quarter-century. Nothing surprised me more in this unusual book than learning there was apparently a thriving “hippie” scene in Fayetteville in 1970. It seems unimaginable from what I experienced, but the dynamic created by soldiers returning from Southeast Asia, heroin, and so on was quite different from anything I remember. Anyway, while I was familiar with the broad outlines of the Jeffrey MacDonald case, I have never read any of the books about it (or seen the mini-series or any of the other documentaries).

LibraryThing Ownership Relative to Total Sales

The Guardian recently posted some sales data of the Booker Prize winners. I thought it would be interesting to compare those figures with LibraryThing ownership to see how reliable that latter figure might be in determining a book’s total sales. The median was 2.77%, mean 3.88%. The table is below, not very well-formatted I’m afraid.

1969 | PH Newby | Something To Answer For | Faber & Faber | 421 | 64 | 15.20%
1970 | Bernice Rubens | The Elected Member | Eyre & Spottiswoode | 3,901 | 133 | 3.