
Notes on good judgement and how to develop it


Judgement, which I roughly define as ‘the ability to weigh complex information and reach calibrated conclusions,’ is clearly a valuable skill.

In our simple analysis of which skills make people most employable, using data from the Bureau of Labor Statistics across the US economy, ‘judgement and decision making’ came out top (though it’s meant there in a broader sense than we use here).

My guess is that good judgement is even more important when aiming to have a positive impact.

What follows are some notes on why good judgement matters, what it is, and what we know about how to improve it.

Why good judgement is so valuable when aiming to have an impact

One reason is lack of feedback. We can never be fully certain which issues are most pressing, or which interventions are most effective. Even in an area like global health – where we have relatively good data on what works – there has been huge debate over the cost effectiveness of even a straightforward intervention like deworming. Deciding whether to focus on deworming requires judgement.

This lack of feedback becomes even more pressing when we come to efforts to reduce existential risks or help the long-term future, and efforts that take a more ‘hits based’ approach to impact. An existential catastrophe can only happen once, so there’s a limit to how much data we can ever have about what reduces the risk, and we must mainly rely on judgement.1

Reducing existential risks and some of the other areas we focus on are also new fields of research, so we don’t even have established heuristics or widely accepted knowledge that someone can simply learn and apply in place of using their judgement.

You may not need to make these judgement calls yourself – but you at least need to have good enough judgement to pick someone else with good judgement to listen to.

In contrast, in other domains it’s easier to avoid relying on judgement. For instance, in the world of for-profit startups, it’s possible (somewhat) to try things, gain feedback by seeing what creates revenue, and refine from there. Someone with so-so judgement can use other approaches to pursue a good strategy.

Other fields have other ways of avoiding judgement. In engineering you can use well-established quantitative rules to figure out what works. When you have lots of data, you can use statistical models. Even in more qualitative research like anthropology, there are standard ‘best practice’ research methods that people can use. In other areas you can follow traditions and norms that embody centuries of practical experience.

I get the impression that many in effective altruism agree that judgement is a key trait. In the 2020 EA Leaders Forum survey, respondents were asked which traits they would most like to see in new community members over the next five years, and judgement came out highest by a decent margin.

  • Good judgement (weighing complex information and reaching calibrated conclusions): 5.8
  • Analytical intelligence: 5.1
  • Entrepreneurial mindset (being able to make things happen independently): 5.0
  • Independent thinking (developing one’s own views): 5.0
  • Altruism/Prioritizing the common good: 4.6
  • Honesty/Transparency: 4.4
  • Emotional intelligence/Social skills: 4.3
  • Grit and work ethic: 3.6
  • Ambition: 3.3
  • Creativity: 3.1

It’s also notable that two of the other most desired traits – analytical intelligence and independent thinking – both relate to what we might call ‘good thinking’ as well. (Though note that this question was only about ‘traits,’ as opposed to skills/expertise or other characteristics.)

I think this makes sense. Someone with unusually good and trusted judgement can decide what an organisation’s strategy should be, or make large grants. This is valuable in general, and the community currently seems especially short of people able to do this kind of work, due to the funding overhang. Many of the bottlenecks faced by the community right now also involve research, which requires a lot of judgement. When we’ve looked into the traits required to succeed in our priority paths, good judgement usually seems very important.

One promising feature of good judgement is that it seems more possible to improve than raw intelligence. So, what – more practically – is good judgement, and how can one get it?

More on what good judgement is

I introduced a rough definition above, but there’s a lot of disagreement about what exactly good judgement is, so it’s worth saying a little more. Many common definitions seem overly broad, making judgement a central trait almost by definition. For instance, the Cambridge Dictionary defines it as:

The ability to form valuable opinions and make good decisions

While the US Bureau of Labor Statistics defines it as:

Considering the relative costs and benefits of potential actions to choose the most appropriate one

I prefer the rough, narrower definition I introduced at the start (which was also used in the survey mentioned above), since it makes judgement more clearly distinct from other cognitive traits:

The ability to weigh complex information and reach calibrated conclusions

More practically, I think of someone with good judgement as someone able to:

  1. Focus on the right questions
  2. When answering those questions, synthesise many forms of weak evidence using good heuristics, and weigh the evidence appropriately
  3. Be resistant to common cognitive biases by having good habits of thinking
  4. Come to well-calibrated conclusions

Owen Cotton-Barratt wrote out his understanding of good judgement, breaking it into ‘understanding’ and ‘heuristics.’ His notion is a bit broader than mine.

Here are some closely related concepts:

  • Keith Stanovich’s work on ‘rationality,’ which seems to be something like someone’s ability to avoid cognitive biases, and is ~0.7 correlated with intelligence (so, closely related but not exactly the same)
  • The cluster of traits (listed later) that make someone a good ‘superforecaster’ in Philip Tetlock’s work (Tetlock also claims that intelligence is only modestly correlated with being a superforecaster)

Here are some other concepts in the area, but that seem more different:

  • Intelligence: I think of this as more like ‘processing speed’ – your ability to make connections, have insights, and solve well-defined problems. Intelligence is an aid in good judgement – since it lets you make more connections – but the two seem to come apart. We all know people who are incredibly bright but seem to often make dumb decisions. This could be because they’re overconfident or biased, despite being smart.
  • Strategic thinking: Good strategic thinking involves being able to identify top priorities, develop a good plan for working towards those priorities, and improve the plan over time. Good judgement is a great aid to strategy, but a good strategy can also make judgement less necessary (e.g. by creating a good backup plan, you can minimise the risks of your judgement being wrong).
  • Expertise: Knowledge of the topic is useful all else equal, but Tetlock’s work (covered more below) shows that many experts don’t have particularly accurate judgement.
  • Decision making: Good decision making depends on all of the above: strategy, intelligence, and judgement.

Some notes on how to improve your judgement

How to improve judgement is an unsolved problem. The best overview I’ve found of what’s currently known is Open Philanthropy’s review, by Luke Muehlhauser, of the research into methods to improve the judgement of their staff. The following suggestions are aligned with what it concludes.

In particular, the suggestions draw significantly on Phil Tetlock’s research into how to improve forecasting. This is the single best body of research I’m aware of in the area of improving judgement.

Tetlock’s research stands out from other research into improving decision making because:

  • He developed a way to quantitatively measure judgement, by tracking people’s accuracy at predicting events in current affairs.
  • This meant he could identify the best forecasters and their habits.
  • What he learned was used to create a training programme, which was then tested in a randomised controlled trial; most other techniques for improving decision making have never been rigorously tested in this way.

Tetlock wrote a great popular summary of his research, Superforecasting. We have a summary of the book and interview with him on our podcast (followed by a second interview).

You can see a more thorough review of Tetlock’s work prepared by AI Impacts, with a lot of fascinating data. For instance, the training programme was found to improve accuracy by around 10%, with the effect lasting for several years.
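Accuracy in this research is measured with Brier scores: the squared difference between the probability you assigned and what actually happened, averaged over all your forecasts. Here’s a minimal Python sketch using the common binary (happened/didn’t happen) form; the example track record is made up:

```python
def brier_score(forecasts):
    """Mean squared difference between probabilistic forecasts and outcomes.

    forecasts: list of (probability, outcome) pairs, where outcome is
    1 if the event happened and 0 if it didn't. Lower is better:
    0.0 is perfect, 0.25 is what always saying '50%' scores, and
    confidently wrong forecasts approach 1.0.
    """
    return sum((p - o) ** 2 for p, o in forecasts) / len(forecasts)

# A made-up track record of three forecasts.
print(brier_score([(0.8, 1), (0.6, 0), (0.9, 1)]))  # ~0.137
```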

Forecasting isn’t exactly the same as good judgement, but seems very closely related: it at least requires ‘weighing up complex information and coming to calibrated conclusions’, though it might require other abilities too. That said, I also take good judgement to include picking the right questions, which forecasting doesn’t cover.

All told, I think there’s enough overlap that if you improve at forecasting, you’re likely going to improve your general judgement as well. I don’t cover other ways to improve judgement as much, because I don’t think they have as much evidence behind them.

So here are some ways to improve your judgement:

Spend an hour or two doing calibration training

Being well calibrated is an important input into judgement, and I mention it as part of my short definition of judgement at the start. It means being able to quantify your uncertainty so that if you say you’re 80% confident in a statement, you’ll be right four out of five times.

This is important because there’s a big difference between 20% and 80% confidence, but these could easily both be called ‘likely’ in natural language.

There is evidence to suggest that people can improve their calibration in just an hour of training, and there’s some chance this transfers across domains.

For this reason, Open Philanthropy commissioned a calibration training app, which you can try here.
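If you want to check your calibration from your own track record, the standard approach is to group past predictions by stated confidence and compare each group’s stated confidence with its actual hit rate. Here’s a minimal Python sketch (the helper function and its track record are just illustrative):

```python
from collections import defaultdict

def calibration_report(forecasts, bucket_width=0.1):
    """Group past predictions by stated confidence and compare each
    bucket's average stated confidence with its actual hit rate.

    forecasts: list of (probability, outcome) pairs, with outcome 1
    if the event happened and 0 if it didn't.
    """
    buckets = defaultdict(list)
    for p, o in forecasts:
        buckets[round(p / bucket_width) * bucket_width].append((p, o))
    for key in sorted(buckets):
        pairs = buckets[key]
        stated = sum(p for p, _ in pairs) / len(pairs)
        actual = sum(o for _, o in pairs) / len(pairs)
        print(f"stated {stated:.0%} -> actual {actual:.0%} (n={len(pairs)})")

# Made-up track record: well calibrated at 80%, overconfident at 90%.
calibration_report([
    (0.8, 1), (0.8, 1), (0.8, 1), (0.8, 1), (0.8, 0),
    (0.9, 1), (0.9, 0), (0.9, 0),
])
```

In this invented record, the forecaster’s 80% claims come true 80% of the time, but their 90% claims come true only a third of the time — the signature of overconfidence.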

Practice forecasting

As with any skill, the best way to improve is to actually practice. To improve your forecasting, you can practice making forecasts – especially if you also start to apply some of the techniques covered in the next section while doing it.

Here are some ways to practice:

  • Join a public forecasting platform, such as Good Judgement Open or Metaculus, which will track and score your predictions over time
  • Write down probabilistic predictions about upcoming events in your own life and work, then score them once the outcomes are known

One weakness of the research on forecasting is that it doesn’t cover how to focus on the right questions in the first place. This is an area of active research briefly covered in our second podcast with Tetlock.

Keep in mind that having calibrated overall judgement isn’t the only habit of thought that matters. Within a team you may want some people who generate creative new ideas and advocate for them even when they’re probably wrong, or challenge the consensus even when it’s probably right. That may be easier to do if you’re overconfident, so there may be a tension between what habits are best for individual judgement and what’s most helpful when contributing to a group’s collective judgement.

Apply these techniques

Luke from Open Philanthropy lists a few techniques for improving judgement that have some backing in the research:

  1. Train probabilistic reasoning: In one especially compelling study (Chang et al. 2016), a single hour of training in probabilistic reasoning noticeably improved forecasting accuracy. Similar training has improved judgemental accuracy in some earlier studies, and is sometimes included in calibration training.
  2. Incentivise accuracy: In many domains, incentives for accuracy are overwhelmed by stronger incentives for other things, such as incentives for appearing confident, being entertaining, or signalling group loyalty. Some studies suggest that accuracy can be improved merely by providing sufficiently strong incentives for accuracy, such as money or the approval of peers.
  3. Think of alternatives: Some studies suggest that judgemental accuracy can be improved by prompting subjects to consider alternate hypotheses.
  4. Decompose the problem: Another common recommendation is to break each problem into easier-to-estimate sub-problems.
  5. Combine multiple judgements: Often, a weighted (and sometimes ‘extremized’) combination of multiple subjects’ judgements outperforms the judgements of any one person (see the sketch just below this list).
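To illustrate the last technique, here’s a minimal Python sketch of one common aggregation approach: average the individual forecasts in log-odds space, then ‘extremize’ the result by pushing it away from 50%, on the grounds that independent forecasters each hold only part of the available evidence. The function name and the value of alpha are illustrative; the forecasting-tournament literature reports extremizing factors somewhat above 1 working well:

```python
import math

def extremized_pool(probabilities, alpha=2.5):
    """Pool several forecasters' probabilities for one event:
    average them in log-odds space, then multiply by alpha > 1
    to push the pooled estimate away from 50% ('extremizing').
    alpha = 1 gives a plain log-odds average; alpha here is illustrative.
    """
    log_odds = [math.log(p / (1 - p)) for p in probabilities]
    pooled = alpha * sum(log_odds) / len(log_odds)
    return 1 / (1 + math.exp(-pooled))

# Three forecasters independently lean the same way; the pooled
# estimate ends up more confident than any individual.
print(extremized_pool([0.7, 0.65, 0.75]))  # ~0.89
```

In practice, the weight given to each forecaster and the degree of extremizing are tuned on past performance; the point here is just the mechanism.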

Here are Tetlock’s 10 commandments of forecasting (plus one meta-command), as summarised by AI Impacts:

  1. Triage: Don’t waste time on questions that are ‘clocklike’ (where a rule of thumb can get you pretty close to the correct answer) or ‘cloudlike’ (where even fancy models can’t beat a dart-throwing chimp).
  2. Break seemingly intractable problems into tractable sub-problems: This is how Fermi estimation works. One related piece of advice is “be wary of accidentally substituting an easy question for a hard one,” e.g. substituting “Would Israel be willing to assassinate Yasser Arafat?” for “Will at least one of the tests for polonium in Arafat’s body turn up positive?”
  3. Strike the right balance between inside and outside views: In particular, first anchor with the outside view and then adjust using the inside view.
  4. Strike the right balance between under- and overreacting to evidence: “Superforecasters aren’t perfect Bayesian predictors but they are much better than most of us.” Usually do many small updates, but occasionally do big updates when the situation calls for it. Take care not to fall for things that seem like good evidence but aren’t; remember to think about P(E|H)/P(E|~H) (Bayes’ theorem; a worked sketch follows this list); remember to avoid the base-rate fallacy.
  5. Look for the clashing causal forces at work in each problem: This is the ‘dragonfly-eye perspective,’ which is where you attempt to do a sort of mental wisdom of the crowds: have tonnes of different causal models and aggregate their judgements. Use ‘devil’s advocate’ reasoning. If you think that P, try hard to convince yourself that not-P. You should find yourself saying “On the one hand… on the other hand… on the third hand…” a lot.
  6. Strive to distinguish as many degrees of doubt as the problem permits, but no more.
  7. Strike the right balance between under- and overconfidence, between prudence and decisiveness.
  8. Look for the errors behind your mistakes, but beware of rearview-mirror hindsight biases.
  9. Bring out the best in others and let others bring out the best in you: One pervasive guiding principle is “Don’t tell people how to do things; tell them what you want accomplished, and they’ll surprise you with their ingenuity in doing it.” The other pervasive guiding principle is “Cultivate a culture in which people — even subordinates — are encouraged to dissent and give counterarguments.”
  10. Master the error-balancing bicycle: This one should have been called practise, practise, practise. Tetlock says that reading the news and generating probabilities isn’t enough; you need to actually score your predictions so that you know how wrong you were.
  11. Don’t treat commandments as commandments: Tetlock’s point here is simply that you should use your judgement about whether to follow a commandment or not; sometimes they should be overridden.
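To make the Bayesian updating in commandment 4 concrete, here’s a minimal Python sketch of the odds form of Bayes’ theorem, where P(E|H)/P(E|~H) is the likelihood ratio mentioned above (the numbers are made up):

```python
def bayes_update(prior, likelihood_ratio):
    """Odds form of Bayes' theorem:
    posterior odds = prior odds * P(E|H) / P(E|~H).
    """
    posterior_odds = (prior / (1 - prior)) * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Made-up numbers: you start at 30%, then see evidence that's four
# times more likely if your hypothesis is true than if it's false.
print(bayes_update(0.30, 4.0))  # ~0.63
```

Evidence with a likelihood ratio close to 1 produces the many small updates superforecasters usually make; the occasional big update corresponds to evidence whose ratio is far from 1.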

Try to develop the right mindset

The people with the best judgement also seem to have a certain mindset. Luke again:

According to some of the most compelling studies on forecasting accuracy I’ve seen, correlates of good forecasting ability include “thinking like a fox” (i.e. eschewing grand theories for attention to lots of messy details), strong domain knowledge, general cognitive ability, and high scores on “need for cognition,” “actively open-minded thinking,” and “cognitive reflection” scales.

And here’s Tetlock’s portrait of a good forecaster:

Philosophic outlook:

  • Cautious: Nothing is certain
  • Humble: Reality is infinitely complex
  • Nondeterministic: Whatever happens is not meant to be and does not have to happen

Abilities and thinking styles:

  • Actively open-minded: Beliefs are hypotheses to be tested, not treasures to be protected
  • Intelligent and knowledgeable, with a ‘need for cognition’: intellectually curious; enjoy puzzles and mental challenges
  • Reflective: Introspective and self-critical
  • Numerate: Comfortable with numbers

Methods of forecasting:

  • Pragmatic: Not wedded to any idea or agenda
  • Analytical: Capable of stepping back from the tip-of-your-nose perspective and considering other views
  • Dragonfly-eyed: Value diverse views and synthesize them into their own
  • Probabilistic: Judge using many grades of maybe
  • Thoughtful updaters: When facts change, they change their minds
  • Good intuitive psychologists: Aware of the value of checking thinking for cognitive and emotional biases

Work ethic:

  • Growth mindset: Believe it’s possible to get better
  • Grit: Determined to keep at it however long it takes

Spend time with people who have good judgement

I haven’t seen any research about this, but I expect that – as with many skills and mindsets – the best way to improve is to spend time with other people who exemplify them.

Spending time with people who have great judgement can help you improve almost automatically, by giving you behaviours to model, making it easy and fun to practice, giving you immediate feedback, and so on.

Anecdotally, many people I know say that what they found most helpful in improving their judgement was debating difficult questions with other people who have good judgement.

Learn about whatever you’re trying to make judgements about

Although experts are often not the best forecasters, all else equal, more domain knowledge seems to help. Luke from Open Philanthropy:

Several studies suggest that accuracy can be boosted by having (or acquiring) domain expertise. A commonly-held hypothesis, which I find intuitively plausible, is that calibration training is especially helpful for improving calibration, and that domain expertise is helpful for improving resolution.

Learn about when not to use your judgement

One of the other lessons of Tetlock’s work is that combining many forecasts usually increases accuracy, and the average of many forecasts is better than most individuals. This suggests that another way to make better judgements is to seek out other people’s estimates, and to sometimes use them rather than your personal impressions.
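Here’s a toy Python simulation of why averaging helps, under the simplifying assumption that each forecaster’s error is independent (all numbers are illustrative):

```python
import random

random.seed(0)
TRUTH = 0.7  # the 'true' probability of some event (illustrative)

# 25 forecasters each see the truth plus independent noise.
forecasts = [min(max(TRUTH + random.gauss(0, 0.15), 0.01), 0.99)
             for _ in range(25)]

crowd_error = abs(sum(forecasts) / len(forecasts) - TRUTH)
median_individual_error = sorted(abs(f - TRUTH) for f in forecasts)[len(forecasts) // 2]

print(f"crowd error:             {crowd_error:.3f}")
print(f"median individual error: {median_individual_error:.3f}")
```

Because independent errors partly cancel when averaged, the crowd mean usually lands closer to the truth than the typical individual; the benefit shrinks when forecasters share the same biases.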

Exactly how much weight to put on the average estimate compared to your own views is a difficult question.

  • For topics like your choice of career, where you have a lot of information that others don’t, it makes sense to rely more on your own assessments.
  • For broader issues, we still need people to work to develop their personal impressions so that the average view can be well informed (if everyone defers to everyone else, no research will ever be done).

It’s also important to remember it can often still be worth acting on a contrarian position, so long as the upsides are much bigger than the downsides.

I hope to write more about how to balance judgement and independent thinking in another post (for now, I’d recommend reading In defence of epistemic modesty by Greg Lewis and the comments).

