All About Blackboard Collaborate

I spent the weekend playing with Blackboard Collaborate and successfully held a Y1 tutorial and Y3 project group meeting in it. It was fine!

It can work well for lectures, letting you stream your lecture, with a recording, from anywhere (no need to be in a lecture theatre). The recording can be downloaded.

Students just need a reasonable browser and decent connection. If they can Skype, they can do this.

They don’t need to be logged in to Blackboard to join in.

I’ve made a starter video, to show you how to expose it:

More coming! I’ll add them here.

I can give you access to a session for you to play around with it and can do demos. Email me (UoM staff and students first, please).

Week 5

This is the last week of period 1. On the one hand, yay! Odds of surviving seem high. On the other, I haven’t survived it yet. And of course I have to give a PGR open day talk tomorrow. And write exams. And grade coursework. Etc.

I don’t know if it’d be easier if I weren’t sick, but it would be less uncomfortable.

Losing my Saturday wasn’t helpful!

I make what most people would consider a ton of money. My work is prestigious and largely fulfilling and interesting. So many people work at brutalising jobs with little security and no future. I had moments where I feared that fate, though my high socio-economic status protected me more than I feared it would.

In countries like the US and UK this is largely a voluntary societal choice. We could easily do better. Indeed, much much better.

Perhaps we will! We’re having a bad time with right wing wreckers holding far too much power, but they can be beaten.

Update: On the right principle of “don’t be like Elon Musk,” I wonder about this post. My fundamental point is the anxiety which makes a moderately stressful, low physical-effort workload into something that makes me feel terrible. Writing about it helps a bit and I hope sharing can help some people with similar anxiety.

Same Day Grading

Or near same day.

One reason I started adding MCQ quizzes to my coursework is the fast turnaround. This year I aimed to get all my Software Engineering coursework back by the end of the class where it was due. Part of that is the “automated” grading, but part was organizing my TAs to grade the short essays quickly.

They rose to the occasion. All 4 did this set rather than the normal 2, so they had 12–15 each. They sat together for cross-validation. They only used a rubric (no manual comments) but were at the late lab for discussion.

The students had a morning lab where they tried to apply the rubric to their own essay and a partner’s.

This seems nigh perfect. They got a lot of help before the next essay. There was no “wasted” grading effort (e.g., ignored feedback). And the TAs are free! No grading hanging over their heads.

I’m happy! Wins all around.

Making Principled Unprincipled Choices

I like principled decision making. Indeed, few things inspired me as much as this quote from Leibniz:

if controversies were to arise, there would be no more need of disputation between two philosophers than between two calculators. For it would suffice for them to take their pencils in their hands and to sit down at the abacus, and say to each other (and if they so wish also to a friend called to help): Let us calculate.

Alas, there’s no decision making situation where this vision holds, even in principle. But still, I like my decisions to conform to some articulable rationale, preferably in the form of some set of general rules.

But some of my rules are meta-rules which focus on resource use. Obviously, one goal of decision-making rules is to maximise the chances of making the “right” choice. But for any metric of rightness (let’s say, an appliance with the best value for money) there’s a cost in the effort needed to secure the maximum (e.g., research, testing, comparing…lots of shopping). That cost can be quite large and interact with subsequent satisfaction in a variety of ways. I’m prone to this and, indeed, end up in decision paralysis.

In response to this, one of my meta-rules is “don’t over-sweat it”. So, for small stuff, this reduces to “don’t sweat the small stuff”. But, because of my anxiety structures, I tend to see certain classes of small stuff as big stuff. So, I dedicate some effort to seeing small stuff as small. Sometimes, this means making it invisible to me. Poor Zoe often has to make the actual purchase after I’ve done the research, or even make the decision after I’ve done the research. For various classes of minor, irrevocable sub-optimal decisions, I prefer not to know about them. I will obsess, and that doesn’t help anyone.

When the decision is essentially arbitrary (because all choices are incommensurable in toto, or their value is unknowable at the moment), I try to make myself flip a coin (metaphorically, at least). What I try to avoid is building a fake rationale (except when that enables the choosing or makes me happier with the arbitrary choice).

Technical (or teaching) decisions are often best treated as arbitrary, but we have tons of incentives to treat them as requiring a ton of analysis to make the “right” choice. At the moment, I’m evaluating which Python testing framework to use and teach in my software engineering class. I currently use doctest and unittest and have a pretty decent lesson plan around them. doctest is funky and unittest is bog standard. I’d consider dropping doctest because I need room, and we don’t do enough xUnit-style testing for them to really grasp it. Both are built into the standard library.

But then there’s pytest, which seems fairly popular. It has some technical advantages, including a slew of plugins (including for regression testing and BDD-style testing). It scales in complexity nicely…you can just write a test function and you’re done.
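To show what I mean by that scaling (my own illustration, not pytest’s documentation): a pytest test is just a module-level function named `test_*` using the plain `assert` statement. The `pytest` command would discover and run it; there’s no runner boilerplate and, being an ordinary function, it can even be called directly:

```python
def mean(xs):
    """Arithmetic mean of a non-empty sequence."""
    return sum(xs) / len(xs)

# That's the whole ceremony in pytest: a function whose name starts
# with "test_" and bare assert statements.
def test_mean():
    assert mean([1, 2, 3]) == 2
    assert mean([10]) == 10
```

Compare that with the TestCase subclass and `assertEqual` dance that unittest requires for the same check.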

But, of course, it’s a third-party thing and needs to be installed. Any plugins would have to be installed too. Is it “better enough” to ignore the built-in libraries? Or should I add it on top of the built-in libraries? AND THERE MIGHT BE SOMETHING YET BETTER OUT THERE OH NOES!!!!

No. The key principle here is a meta-principle: Don’t invest too much more effort. Make a decision and stick with it. In the end, any of the choices will do and a big determiner will be “does it spark my interest now?” while the other will be “how much extra work is that?”

And that’s fine.


Beyond JSON

JSON pretty clearly won and won big. This is perhaps inevitable given the dominance of Javascript. And that’s ok! JSON is a pretty decent sort of sexpr, and having both lists and dicts makes it pretty useful for quick externalisation of all sorts of data. The friction, in a typical scripting language, of manipulating JSON in memory is super low. Slinging dicts and lists is something any Python (or Javascript, or…) programmer is going to find congenial.
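For instance (a trivial Python sketch), externalising a nested structure is a one-liner in each direction with the stdlib `json` module:

```python
import json

# Dicts and lists map directly onto JSON objects and arrays, so
# round-tripping in-memory data takes no ceremony at all.
record = {"course": "Software Engineering", "weeks": [1, 2, 3], "quiz": True}

text = json.dumps(record)   # serialise to a JSON string
back = json.loads(text)     # deserialise back to dicts and lists

assert back == record
```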

But it has some infelicities, and I don’t just mean the lack of query and schema languages (which is sorta being addressed). JSON is rather annoying to hand-author and doesn’t seem great for documents and document formats. Or even hacking existing documents like HTML…if only because there’s no standard reflection of HTML structure into JSON.

There are some moves to improve this situation.

JSON5 tackles the writability. Probably the biggest move is not having to quote (certain) keys in objects. That helps both reading and writing! For reading, there’s a clear visual difference between key strings and “value” strings. For writing, less quoting!!

The other big one is multi-line strings (with the ‘\’ as the continuation character). Having to have a continuation character sucks, but it’s much better than the status quo ante.

Comments are also a good idea! The rest seem minor, but these definitely make a difference.
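None of these niceties are in the stdlib `json` module, but as a toy illustration of how cheap one of them is, here’s a deliberately naive preprocessor that strips `//` line comments before handing off to the standard parser (it will mangle strings that themselves contain `//`, so this is a sketch, not a JSON5 parser; for real use there are third-party json5 packages):

```python
import json

def loads_with_comments(text):
    # Naive: drop everything after "//" on each line, then parse as
    # plain JSON. Breaks on string values containing "//"; sketch only.
    stripped = "\n".join(line.split("//")[0] for line in text.splitlines())
    return json.loads(stripped)

doc = """
{
    "name": "week5",      // comments make hand-edited files nicer
    "tags": ["teaching"]  // trailing comment
}
"""
data = loads_with_comments(doc)
```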

Mark Notation is aimed at bringing XML-like structuring and extensibility to JSON. It does this by adding a key syntactic (and semantic!) extension, the Mark object, which is a name (think tag/element name), a set of properties (think attributes, but with complex contents), and a list of content objects (think child content). It builds on JSON5, so it has those authoring felicities.

Semantically, Mark objects get mapped into pretty simple Javascript objects. I don’t fully understand this claim:

contents: an ordered list of content objects, which are like child nodes of elements in HTML/XML. Mark utilizes a novel feature of JS that JS object can be array-like. It can store both named properties and indexed properties.

I don’t see why this matters, as you have a special Mark object which has an explicit contents variable. Ah, maybe:

properties: can be accessed through markObj.prop or markObj[‘prop’] when prop is not a proper JS identifier. You can also use JS for … in loop to iterate through the properties. Unlike normal JS array, Mark object has been specially constructed so that Mark contents are not enumerable, thus do not appear in for … in loop.
contents: can be accessed through markObj[index]. You can also use JS for … of loop to iterate through the content items.

So you don’t have to do a field access but can just use special loops. I don’t see that this would be painful in, say, Python even with field accessing. I might default to making Python Mark objects iterable over the contents (on the theory that that’s more “normal”).
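Here’s what I mean, as a hypothetical sketch (the class name and API are mine, not Mark Notation’s): a Python Mark-ish object whose default iteration walks the contents, with indexing doing double duty for properties and children:

```python
class Mark:
    """Hypothetical Python analogue of a Mark object: a name (think
    element name), a properties dict (think attributes), and a list
    of content children."""

    def __init__(self, name, properties=None, contents=None):
        self.name = name
        self.properties = dict(properties or {})
        self.contents = list(contents or [])

    def __getitem__(self, key):
        # Integer keys index the contents; string keys look up properties.
        if isinstance(key, int):
            return self.contents[key]
        return self.properties[key]

    def __iter__(self):
        # Default iteration is over contents, the "normal" Python choice.
        return iter(self.contents)

div = Mark("div", {"class": "note"}, ["hello", Mark("br")])
```

So `div["class"]` reaches a property while `for child in div` walks the children, without needing JS’s array-like object trick.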

It would be interesting to compare APIs to see whether this really brings JSON-like ease of programmer use.

And, of course, there’s YAML, which you can think of as JSON++++. (JSON is a subset of YAML these days.) It’s designed from the ground up for writability and capturing complex structures. And that it does. The price is considerably more complexity. Like a ton more. (You can define entity-like things (actually, more like IDs) on the fly! Inline!) It has excellent embedded complex multiline strings (essentially “here-strings”).

I have to say that it might be easier to grow Mark Notation a bit toward YAML than the reverse. Here-string-like literals plus id references go a long way.


Some Results On Student Evaluations

Erik posted about student evaluations last week and I took the opportunity to bring up the famous Air Force Academy study (the paper; an accessible blog post about it).

The setup is extraordinary:

Prior to the start of the freshman academic year, students take course placement exams in mathematics, chemistry, and select foreign languages. Scores on these exams are used to place students into the appropriate starting core courses (i.e., remedial math, Calculus I, Calculus II, etc.). Conditional on course placement, the USAFA registrar employs a stratified random assignment algorithm to place students into sections within each course/semester. The algorithm first assigns all female students evenly throughout all offered sections, then places male-recruited athletes, and then assigns all remaining students. Within each group (i.e., female, male athlete, and all remaining males), assignments are random with respect to academic ability and professor. Thus, students throughout their 4 years of study have no ability to choose their professors in required core courses. Faculty members teaching the same course use an identical syllabus and give the same exams during a common testing period. These institutional characteristics assure that there is no self-selection of students into (or out of) courses or toward certain professors.

They focused on math classes because the grades are highly normalised:

The integrity of our results depends on the percentage of points earned in core courses being a consistent measure of relative achievement across students. The manner in which student scores are determined at USAFA, particularly in the Math Department, allows us to rule out potential mechanisms for our results. Math professors grade only a small proportion of their own students’ exams, vastly reducing the ability of “easy” or “hard” grading professors to affect their students’ scores. All math exams are jointly graded by all professors teaching the course during that semester in “grading parties,” where Professor A grades question 1 and Professor B grades question 2 for all students taking the course. These aspects of grading allow us to rule out the possibility that professors have varying grading standards for equal student performance. Hence, our results are likely driven by the manner in which the course is taught by each professor.

In some core courses at USAFA, 5–10 percent of the overall course grade is earned by professor/section-specific quizzes and/or class participation. However, for the period of our study, the introductory calculus course at USAFA did not allow for any professor-specific assignments or quizzes. Thus, potential “bleeding heart” professors had no discretion to boost grades or to keep their students from failing their courses. For this reason, we present results in this study for the introductory calculus course and follow-on courses that require introductory calculus as a prerequisite.

This is really amazing! It’s pretty close to a highly controlled experiment!

They found a pretty strong effect from instructor quality:

The USAFA’s comprehensive core curriculum provides a unique opportunity to test how introductory course professors affect follow-on course achievement free from selection bias. The variance estimate is shown in row 2, column 2 of table 4 and indicates that introductory course professors significantly affect follow-on course achievement. The variance in follow-on course value-added is estimated to be 0.0025 (SD = 0.050). The magnitude of this effect is roughly equivalent to that estimated in the contemporaneous course and indicates that a one standard-deviation change in introductory professor quality results in a 0.05-standard-deviation change in follow-on course achievement.

The striking bit (for me) is their examination of student evaluation and grades:

Next, we examine the relationship between student evaluations of professors and student academic achievement as in Weinberg, Hashimoto, and Fleisher (2009). This analysis gives us a unique opportunity to compare the relationship between value-added models (currently used to measure primary and secondary teacher quality) and student evaluations (currently used to measure postsecondary teacher quality).

…In column 1, results for contemporaneous value-added are positive and statistically significant at the .05 level for scores on all six student evaluation questions. In contrast, results in column 2 for follow-on course value-added show that all six coefficients are negative, with three significant at the .05 level and three significant at the .10 level. Since proposals for teacher merit pay are often based on contemporaneous teacher value-added, we examine rank orders between our professor value-added estimates and student evaluation scores. We compute rank orders of career average student evaluation data for the question, “The instructor’s effectiveness in facilitating my learning in the course was,” by professor… As an illustration, the calculus professor in our sample who ranks dead last in deep learning ranks sixth and seventh best in student evaluations and contemporaneous value-added, respectively.

Our findings show that introductory calculus professors significantly affect student achievement in both the contemporaneous course being taught and the follow-on related curriculum. However, these methodologies yield very different conclusions regarding which professors are measured as high quality, depending on the outcome of interest used. We find that less experienced and less qualified professors produce students who perform significantly better in the contemporaneous course being taught, whereas more experienced and highly qualified professors produce students who perform better in the follow-on related curriculum. Owing to the complexities of the education production function, where both students and faculty engage in optimizing behavior, we can only speculate as to the mechanism by which these effects may operate. Similar to elementary and secondary school teachers, who often have advance knowledge of assessment content in high-stakes testing systems, all professors teaching a given course at USAFA have an advance copy of the exam before it is given. Hence, educators in both settings must choose how much time to allocate to tasks that have great value for raising current scores but may have little value for lasting knowledge.

And the key bit:

Regardless of how these effects may operate, our results show that student evaluations reward professors who increase achievement in the contemporaneous course being taught, not those who increase deep learning. Using our various measures of teacher quality to rank-order teachers leads to profoundly different results. Since many U.S. colleges and universities use student evaluations as a measurement of teaching quality for academic promotion and tenure decisions, this finding draws into question the value and accuracy of this practice.

Now, in the comment thread, people pointed out that this result, while strong, was narrow (restricted to fairly low-level math and “adjacent” courses). These facts are related, of course. By focusing on the data with the fewest confounders, they narrowed the scope but increased the strength. However, there have been subsequent studies. For example, “Evaluating students’ evaluations of professors”:

The empirical analysis is based on data for one enrollment cohort of undergraduate students at Bocconi University, an Italian private institution of tertiary education offering degree programs in economics, management, public policy and law. We select the cohort of the 1998/1999 freshmen because it is the only one available where students were randomly allocated to teaching classes for each of their compulsory courses.
The students entering Bocconi in the 1998/1999 academic year were offered 7 different degree programs but only three of them attracted enough students to require the splitting of lectures into more than one class: Management, Economics and Law&Management. Students in these programs were required to take a fixed sequence of compulsory courses that span over the first two years, a good part of their third year and, in a few cases, also their last year.

The exam questions were also the same for all students (within degree program), regardless of their classes. Specifically, one of the teachers in each course (normally a senior faculty member) acted as a coordinator, making sure that all classes progressed similarly during the term and addressing problems that might have arisen. The coordinator also prepared the exam paper, which was administered to all classes. Grading was delegated to the individual teachers, each of them marking the papers of the students in his/her own class. The coordinator would check that the distributions were similar across classes but grades were not curved, neither across nor within classes.

They also looked at evaluation/grade correlations:

In this section we investigate the relationship between our measures of teaching effectiveness and the evaluations teachers receive from their students. We concentrate on two core items from the evaluation questionnaires, namely overall teaching quality and the overall clarity of the lectures.

The key bit:

Our benchmark class effects are negatively associated with all the items that we consider, suggesting that teachers who are more effective in promoting future performance receive worse evaluations from their students. This relationship is statistically significant for all items (but logistics), and is of sizable magnitude. For example, a one-standard deviation increase in teacher effectiveness reduces the students’ evaluations of overall teaching quality by about 50% of a standard deviation. Such an effect could move a teacher who would otherwise receive a median evaluation down to the 31st percentile of the distribution. Effects of slightly smaller magnitude can be computed for lecturing clarity.

Finally, I did peek at a 2017 meta-analysis which supersedes earlier meta-analyses. The abstract:

Student evaluation of teaching (SET) ratings are used to evaluate faculty’s teaching effectiveness based on a widespread belief that students learn more from highly rated professors. The key evidence cited in support of this belief are meta-analyses of multisection studies showing small-to-moderate correlations between SET ratings and student achievement (e.g., Cohen, 1980, 1981; Feldman, 1989). We re-analyzed previously published meta-analyses of the multisection studies and found that their findings were an artifact of small sample sized studies and publication bias. Whereas the small sample sized studies showed large and moderate correlation, the large sample sized studies showed no or only minimal correlation between SET ratings and learning. Our up-to-date meta-analysis of all multisection studies revealed no significant correlations between the SET ratings and learning. These findings suggest that institutions focused on student learning and career success may want to abandon SET ratings as a measure of faculty’s teaching effectiveness.

I mean, it’s brutal:

In combination, our new up-to-date meta-analyses based on nearly 100 multisection studies, as well as our re-analyses of the previous meta-analyses make it clear that the previous reports of “moderate” and “substantial” SET/learning correlations were artifacts of small size study effects. The best evidence − the meta-analyses of SET/learning correlations when prior learning/ability are taken into account − indicates that the SET/learning correlation is zero. Contrary to a multitude of reviews, reports, as well as self-help books aimed at new professors (a few of them quoted above), the simple scatterplots as well as more sophisticated meta-analyses methods indicate that students do not learn more from professors who receive higher SET ratings.

And student evaluations are a substantial part of the Teaching Excellence Framework.

So that’s really bad!

When we throw in gender (and other) biases, it seems clear we have a huge problem.

Worse is Better and Back Again

Richard Gabriel, 1991

I and just about every designer of Common Lisp and CLOS has had extreme exposure to the MIT/Stanford style of design. The essence of this style can be captured by the phrase the right thing. To such a designer it is important to get all of the following characteristics right:

  • Simplicity — the design must be simple, both in implementation and interface. It is more important for the interface to be simple than the implementation.
  • Correctness — the design must be correct in all observable aspects. Incorrectness is simply not allowed.
  • Consistency — the design must not be inconsistent. A design is allowed to be slightly less simple and less complete to avoid inconsistency. Consistency is as important as correctness.
  • Completeness — the design must cover as many important situations as is practical. All reasonably expected cases must be covered. Simplicity is not allowed to overly reduce completeness.

I believe most people would agree that these are good characteristics. I will call the use of this philosophy of design the MIT approach. Common Lisp (with CLOS) and Scheme represent the MIT approach to design and implementation.

The worse-is-better philosophy is only slightly different:

  • Simplicity — the design must be simple, both in implementation and interface. It is more important for the implementation to be simple than the interface. Simplicity is the most important consideration in a design.
  • Correctness — the design must be correct in all observable aspects. It is slightly better to be simple than correct.
  • Consistency — the design must not be overly inconsistent. Consistency can be sacrificed for simplicity in some cases, but it is better to drop those parts of the design that deal with less common circumstances than to introduce either implementational complexity or inconsistency.
  • Completeness — the design must cover as many important situations as is practical. All reasonably expected cases should be covered. Completeness can be sacrificed in favor of any other quality. In fact, completeness must be sacrificed whenever implementation simplicity is jeopardized. Consistency can be sacrificed to achieve completeness if simplicity is retained; especially worthless is consistency of interface.

Early Unix and C are examples of the use of this school of design, and I will call the use of this design strategy the New Jersey approach. I have intentionally caricatured the worse-is-better philosophy to convince you that it is obviously a bad philosophy and that the New Jersey approach is a bad approach.

However, I believe that worse-is-better, even in its strawman form, has better survival characteristics than the-right-thing, and that the New Jersey approach when used for software is a better approach than the MIT approach.

Olin Shivers, 1998

* Preamble: 100% and 80% solutions
There’s a problem with tool design in the free software and academic
community. The tool designers are usually people who are building tools for
some larger goal. For example, let’s take the case of someone who wants to do
web hacking in Scheme. His Scheme system doesn’t have a sockets interface, so
he sits down and hacks one up for his particular Scheme implementation. Now,
socket API’s are not what this programmer is interested in; he wants to get on
with things and hack the exciting stuff — his real interest is Web services.
So he does a quick 80% job, which is adequate to get him up and running, and
then he’s on to his original goal.

Unfortunately, his quickly-built socket interface isn’t general. It just
covers the bits this particular hacker needed for his applications. So the
next guy that comes along and needs a socket interface can’t use this one.
Not only does it lack coverage, but the deep structure wasn’t thought out well
enough to allow for quality extension. So *he* does his *own* 80%
implementation. Five hackers later, five different, incompatible, ungeneral
implementations had been built. No one can use each other’s code.

The alternate way systems like this end up going over a cliff is that the
initial 80% system gets patched over and over again by subsequent hackers, and
what results is 80% bandaids and 20% structured code. When systems evolve
organically, it’s unsurprising and unavoidable that what one ends up with is a
horrible design — consider the DOS -> Win95 path.

As an alternative to five hackers doing five 80% solutions of the same
problem, we would be better off if each programmer picked a different task,
and really thought it through — a 100% solution. Then each time a programmer
solved a problem, no one else would have to redo the effort. Of course, it’s
true that 100% solutions are significantly harder to design and build than 80%
solutions. But they have one tremendous labor-savings advantage: you don’t
have to constantly reinvent the wheel. The up-front investment buys you
forward progress; you aren’t trapped endlessly reinventing the same awkward wheel.

But here’s what I’d really like: instead of tweaking regexps, you go do your
own 100% design or two. Because I’d like to use them. If everyone does just
one, then that’s all anyone has to do.

Kevlin Henney, 2017:

A common problem in component frameworks, class libraries, foundation services, and other infrastructure code is that many are designed to be general purpose without reference to concrete applications. This leads to a dizzying array of options and possibilities that are often unused or misused — or just not useful.

Generally, developers work on specific systems; specifically, the quest for unbounded generality rarely serves them well (if at all). The best route to generality is through understanding known, specific examples, focusing on their essence to find an essential common solution. Simplicity through experience rather than generality through guesswork.

Speculative generality accumulates baggage that becomes difficult or impossible to shift, thereby adding to the accidental complexity those in development must face in future.

Although many architects value generality, it should not be unconditional. People do not on the whole pay for — or need — generality: they tend to have a specific situation, and it is a solution to that specific situation that has value.

We can find generality and flexibility in trying to deliver specific solutions, but if we weigh anchor and forget the specifics too soon, we end up adrift in a sea of nebulous possibilities, a world of tricky configuration options, overloaded and overburdened parameter lists, long-winded interfaces, and not-quite-right abstractions. In pursuit of arbitrary flexibility, you can often lose valuable properties — whether intended or accidental — of alternative, simpler designs.

Ok, the last one is a bit more…specific…than the first two. But it’s fun to read it in juxtaposition with them. One way to try to bridge the difference between Henney and Shivers is to note that Shivers is saying that we need more 100% designs and Henney is saying that we need a lot of specific experience to get to a good 100% design. But then the difference becomes stronger…Shivers doesn’t want people to hack up a bunch of 80% solutions while Henney, roughly, thinks we have to have them before we have a hope for a right 100% one.

My heart is with Shivers, but my head is with Henney.

I think I have some readings and an exam question for next year’s class.

Ah, Grading

Lost track of posting and lots of other things due to the whelm being over but not done. Some of the grading is going OK. Exams should be sortable. I hope to be back on posting track tomorrow.

Blackboard Learn 9.x Fail Encore

Last year, our installation of Blackboard could upload grades from a spreadsheet. So you could grade offline! Which is good, because we want to grade offline, esp. programs. But boo! You couldn’t upload feedback, so even though the feedback was sitting in a column ready to go, we had to cut and paste it in. BOO!

But then, in spring, a service pack made it possible to upload (and download!) feedback. WOO! This is good! I can grade offline! I can use my tools! I can analyze stuff!

Except I’ve now figured out that if I have multiple-question tests, I can’t upload feedback OR MARKS for individual questions in the test. Which, for something like, oh, I don’t know, a FINAL EXAM, is a big deal.

It’s also going to suck for giving feedback. Lots of cutting and pasting in my future.

Software as a service folks of the world, there is a MINIMUM REQUIREMENT on you: Make sure your users can export and import your data. Easily. Very easily. Make it easy, ok? Use freaking XML if you have to. Just make it easy. From day 1. Until day always. For proper bonus points, make sure that simple things can be done simply. But if not that, just make sure we can do it.

Users of SAAS, demand this. DEMAND IT. If they can’t do it, you should worry.

A Cautionary Tale

It’s hard being a PhD student.

Having been one for quite a long time, I can speak quite passionately about it. Being a passionate person entails that I probably will at the drop of a hat.

Of course, lots of the difficulties with being a PhD student are simply a matter of life. I take a special interest because it was a defining condition of so much of my life, and mentoring PhD students is and will be such a condition for the rest of my life. So when I see a massive failure by a PhD student, I’m inclined to overreflect on it.

Kindred Winecoff posted quite a silly critique of Paul Krugman which was picked up by Henry Farrell. Now, Daniel Drezner has a similar, somewhat more nuanced view expressed with rather less vitriol and hyperbole. They share the same basic flaw: a hugely uncharitable misreading of Krugman as saying that the public bears absolutely no responsibility for the massively disastrous Bush and Bush-era policies since it had no influence on them. (I’m risking similar problems by not doing a very close exegesis of any of the articles. Furthermore, my generally pro-Krugman bent generates similar risks as Winecoff’s anti-Krugman bent.)

(The big error in this reading, AFAICT, is to miss the dialectic at several levels. The line Krugman is pushing back against is the one which justifies austerity measures with a massive negative effect on the poor and powerless along with irresponsible giveaways to the rich and powerful. While there are piles of crap justifications, the key one here is that the public is irresponsible and the elites are relatively helpless in the face of massive public irresponsibility. (Think Santelli.) Whatever responsibility the public bears, I trust that it’s pretty obvious that this line is total nonsense, and that’s Krugman’s core point. And, frankly, it’s the interesting point.)

Winecoff is now in a trap of their own making (yes, like Jane Austen, I use the third-person plural as a gender-neutral singular). They gave a junky critique based on a junky reading and littered it with junky hyperbole, e.g.,

If Greenspan’s “with notably rare exceptions” deserves internet infamy, and it does, then surely Krugman’s less notable exceptions should too.

(Even if the junky reading were correct, these are not remotely comparable. If the junky reading were correct, Krugman would be wrong (this is what Drezner tries, rather crappily AFAICT, to show). Greenspan is engaged in a kind of amazing and disgusting chutzpah in the service of some rather dangerous hackery.)

When appropriately (and gently!) chastised by Farrell, Winecoff fails to do the sensible thing that many commentators urged him to do: take a moment, reflect, and back down. Instead, Winecoff doubles- and trebles-down on the silliness. The silliness is at every level, including a classic “I’m leaving the thread now” followed almost immediately by several more comments.

All this is relatively minor in the grand scheme of things: in the midst of an event like this, it’s really hard to turn oneself around. But given the systematic failures exhibited, I wonder if Winecoff is going to learn from it. If I were his supervisor (US: advisor), I would print all these out and go through them carefully. I’d probably focus more on the dialectic issues (e.g., problems with burden of proof, charity, self-awareness, tactics, strategy, etc.). For example, it’s very unclear what Winecoff hopes to get out of the exchange. I’m afraid that bashing Krugman is the core, which is really a worthless goal, esp. in this context. An easy win would have been to say, “Ok, let’s put my reading of Krugman aside (I’m not ready to give up on it, but maybe that’s because I really can’t stand him; I have to let that rest for a while) and focus on the more interesting question of how to apportion responsibility for policy.”

This only wins if making the point is more important than making the bash. Which is why it’s a good move regardless of your goal if you are in hostile territory. It sidelines bashback for a while in favor of counterpoint. Given enough point and counterpoint, you might find your own goal moving from bashing to pointmaking. (This is not to say that bashing is worthless. Sometimes it’s very worthwhile indeed. But it needs to work, at the very least.)

As I said, Winecoff isn’t irrecoverable. I had a similar (more heated) exchange with a random PhD student on the web and they turned out just fine and we’re reasonable colleagues (I’m still a bit wary of them, though). Of course, I had a similar (even more heated) exchange which did not resolve favorably. If you find yourself in this circumstance, get as much reality checking as you can. Reflect. Talk to other (possibly critical) people. Don’t necessarily seek out supportive people, but people who will tell you when you’re off the rails. If you determine you have gone off the rails, apologize, retract, and learn from the experience. In particular, learn something about your own strengths, weaknesses, and reactions.

Update: You don’t have to be a student to have a major-league fail, as the Synthese scandal shows. The solution to such fails is the same.

However, the action Frances recommends (apologize first) works best in good-faith circumstances. If there’s bad faith or bad blood, admitting fault early can really, really screw you. Asking for time to think about it, or putting up similar disclaimers, can be useful. It really is the case that we fallible people sometimes can’t see the obvious. If you aren’t seeing it, then ask for some time to see it. “Hey folks, I’m seeing a lot of heat from people I generally respect but I’m not getting it. Can we hold things for a bit while I figure out for sure what’s going on?” is a reasonable move.