October 2023

What is the Science of Linguistics a Science of?

My title is a quote from Thomas G. Bever’s “The Cognitive Basis for Linguistic Structures” (346) and reflects a tantalizing uncertainty manifested dramatically at the core of this discipline by what Scholz et al. describe as “enduring, rancorous, often ideologically tinged disputes over the past 45 years”. The disputes have to do with, indeed, what the very subject-matter of linguistics—any language—is, and consequently what its sources of data are to be. One party claims that a language is a public, and publicly observable, phenomenon or class of phenomena, to be studied as any other such phenomenon or class would be by the relevant empirical science: by collecting aggregates (technically called corpora) of actual speech use (as we would collect aggregates of actual planetary motions on our way to a theory of the Solar System) and identifying in them regularities that might support predictions of what to expect in the future. The other party, dominated to this day by the august figure of Noam Chomsky, claims that language is an abstract structure, typically referred to as grammar or syntax, that guides the performances of any native, competent speaker of it, but, since its “guidance” is somewhat shaky and performances are occasionally, even often, incomplete or faulty, it is to be studied not by looking at actual speech use but by eliciting judgments of grammaticality (or other linguistic parameters) from speakers, which supposedly make it possible to bring to light the linguistic intuitions that substantiate their competence. In Chomsky’s words, “The empirical data that I want to explain are the native speaker’s intuitions” (158). Trying to be as nonjudgmental as possible—that is, using terms the two parties use to refer to themselves—I will call the first one empirical and the second one generative.

The reason why the disputes are rancorous is not hard to see: each party’s conception of the subject-matter of linguistics would have us conclude that the other party’s is no science at all. On the one hand, we have statements such as the following, again by Chomsky: in light of the confusion that plagues actual speech use, “it is absurd to attempt to construct a grammar that describes linguistic behaviour directly” (130). On the other, as many of the judgments elicited by a linguist in testing and confirming his hypotheses are his own, we have W. M. Labov urging that “linguists cannot continue to produce theory and data at the same time” (199). In more painstaking (and painful) detail:

“[C]ollecting specimens of what people actually say or write […] will not lead us to form a very clear picture of what the sentences of English are, since first of all, not all these specimens will be sentences of English, and secondly, the number of sentences actually collected will not even begin to exhaust the totality of English sentences” (Langendoen 3).

“As a native speaker of the language, [the linguist] is entitled to invent sentences and non-sentences in helping him to formulate and test his hypotheses. These abilities are usually referred to as linguistic intuitions and are important in that they form an essential part of the data-base of a Chomskyan approach to linguistics” (Atkinson et al. 38).

“[Generative linguistics] continues to insist that its method for gathering data is not only appropriate, but is superior to others. Occasionally a syntactician will acknowledge that no one type of data is privileged, but the actual behavior of people in the field belies this concession. Take a look at any recent article on formal syntax and see whether anything other than the theorist’s judgments constitute the data on which the arguments are based” (Ferreira 372).

“Sitting in an armchair drawing data out of one’s head is so comfortable an approach to academic research that perhaps we should not be too surprised at how long some practitioners stick with it” (Sampson 5).

When Chomsky started working in the field (and revolutionizing it), in the 1950s, there might have been a factual reason why the empirical approach was deemed hopeless: linguistic corpora are huge, and identifying regularities in them can prove daunting; so the “armchair,” introspective approach might have been adopted faute de mieux. Today, that reason no longer holds: computers can digest enormous amounts of text and bring out regularities in them. But, as additional evidence of how entrenched the respective positions are, this development has led less to a rethinking of the basis of linguistics than to the surfacing of a different—or at least differently named—discipline: computational linguistics. Sampson complains about it:

“The fact that modern empirical linguistics relies heavily on computers and corpora gives rise to one strategy of discourse whereby the armchair-dwellers who prefer to stick to intuitive data continue to represent their research style as the linguistic mainstream. Linguists who make crucial use of computers in their work are called “computational linguists,” and “corpus linguistics” is regarded as one branch of computational linguistics. Quite a lot of corpus linguists, the present author included, nowadays work in departments of computer science (where empirical techniques tend to be taken for granted) rather than departments of linguistics. […] In reality, corpus linguistics is not a special subject. To be a corpus linguist is simply to be an empirical linguist, making appropriate use of the available tools and resources which are enabling linguists at the turn of the century to discover more than their predecessors were able to discover, when empirical techniques were last in vogue. As Michael Hoey put it in a remark at a recent conference, “corpus linguistics is not a branch of linguistics, but the route into linguistics”” (6).

It is, by all means, an unfortunate situation, as well as a tricky one. I don’t believe, as generative linguists tend to, that language is a uniquely human achievement and resource, but certainly humans have perfected it like no other beings we are aware of (so far, at least, since artificial intelligence is threatening), to the point where we can regard it as their greatest achievement and resource; so how is it to be understood that scholars who have made language their area of research are in such sharp disagreement, and have been for decades, with no apparent resolution in sight, on the identity of their topic and on the ways of doing research on it?

For some, the way to go in the face of such a predicament is to attempt some kind of compromise between the two schools. This is how Carson T. Schütze describes his goal at the very beginning of The Empirical Base of Linguistics, where he intends to bring responsible empirical methodologies to bear upon the study of linguistic intuitions:

“I aim to demonstrate in this book that grammaticality judgments and other sorts of linguistic intuition, while indispensable forms of data for linguistic theory, require new ways of being collected and used. A great deal is known about the instability and unreliability of judgments, but rather than propose that they be abandoned, I endeavor to explain the source of their shiftiness and how it can be minimized. I argue that if several simple steps are taken to remove obvious sources of bias, grammaticality judgments can provide an excellent source of information about people’s grammars. Thus, I respond to two of the most widespread criticisms of generative grammar—namely, that it involves constructing theories of intuition rather than of language use, and that it is highly subjective and biased by the views of the linguist. […] Linguists can expect to take away from this book numerous practical suggestions on how to collect better and more useful data, and on how to respond to criticisms of such data” (1).

My attitude here is different, indeed it is the very opposite. I see the contrast at hand as symptomatic of a fundamental ambiguity whose scope goes well beyond language, and I am interested in bringing it out in full force, not fudging it; in sharpening it, not creating the delusion of a soothing harmony—because it matters a lot to our form of life, and should not go unrecognized. Due to its vastness, the ambiguity is hard to perceive if one focuses too closely on the details of a single discipline or area of inquiry; so, being myself neither a linguist nor conversant with the bookshelves of related articles and volumes, but rather keen on addressing deep and general philosophical questions, I take it I am in the best position to do the debate, and the relevant abysmal chasm, full justice.

To begin with, after computers dispelled the only serious worry that hindered empirical linguistic work, this work is free to proceed, and none of the objections raised by generative linguists hold water. That people’s linguistic utterances are incomplete or faulty is no barrier to discovering empirical regularities in them, just as the fact that an obstacle may stop the course of an avalanche is no barrier to discovering empirical regularities in the falling of avalanches. That a corpus will not provide negative examples of ungrammatical utterances (another recurrent charge) is no problem either, and no reason to replace the study of corpora with the elicitation of responses to those kinds of utterances: we never see dropping a stone followed by it stopping in mid-air or levitating, but that is no reason to stop empirical inquiries into Newton’s laws and replace them with consulting our intuitions. And, to continue belaboring the obvious, that intuitions play a role in the construction of scientific theories is true; but, once a theory has been constructed, by whatever means, it is to be tested, confirmed or refuted on the basis of empirical, publicly observable evidence—doing otherwise would be to regress to obscurantist, unscientific eras of our tradition. The empirical study of linguistic corpora is, therefore, an absolutely fine scientific endeavor, which departments of linguistics would do well to find sizable room for.

To move to my next point, I bring up one last quote from Sampson’s book:

“[T]he empirical scientific method […] does not apply in every important domain. You cannot base a system of moral principles on the scientific method, because there is no way that observation can “refute” or “confirm” a moral principle such as “one should obey the law even if one disagrees with it.” As philosophers say, there is no way to derive an “ought” from an “is.” In such domains, one has to look for other ways to establish the validity of a set of beliefs. Typically, those other ways are less reliable than the scientific method; and so one finds that people often disagree about even basic moral questions” (1).

The reason why a moral theory (a system of moral principles) cannot be based on the scientific method is that it is not concerned with facts, which it is the scientific method’s task to establish, but rather with values (with what is found worthwhile) and with norms (with what, in light of it being worthwhile, ought to be done—for example, obey the law even if one disagrees with it). And here is the rub: There are ambitious, majestic moral theories. Aristotle’s derives from the eudemonistic principle that flourishing is the highest human goal; Kant’s from the rational principle of the categorical imperative; Utilitarianism from the principle of maximizing the balance of pleasure over pain for the greatest number. But, ultimately, what those theories must respond to is human intuitions about what is worthwhile and, as a consequence, ought to be done. That is why the simplest form of Utilitarianism faces the scapegoat paradox: even if there were a way of maximizing the balance of pleasure over pain for the greatest number by inflicting horrible pain on a single person, most of us would intuitively reject that course of action, and as a result Utilitarians must go back to the drawing board and revise their theory accordingly.

Winslow Homer, "The Life Brigade" (1883)

Value-judgments, and the norms that go with them, are all over human experience. People express values, and related norms, when they judge a painting or a sunset to be beautiful, when they judge a certain behavior at table to be proper etiquette, when they judge a monastic life to be a better realization of the human potential than a mundane one (or vice versa), when they judge the stability of a relation to be preferable to the effervescence of it (or vice versa). And of course, as Sampson said (and some of my examples suggest), they sharply disagree on all these issues, voicing distinct, even contrary, judgments. Faced with such disagreements, and frustrated by them, one might want to change the subject into something more manageable, on which decisions can be more easily reached. One might, say, espouse legal positivism: give up on the question whether a law, or legal system, is worthwhile and ought to be obeyed, and just review and analyze what legal systems there are. In the words of the English jurist John Austin:

“The existence of law is one thing; its merit and demerit another. Whether it be or be not is one enquiry; whether it be or be not conformable to an assumed standard, is a different enquiry” (157).

Along these lines, the study of law becomes an empirical one, concerned with social facts and the regularities emerging there no less than the empirical study of language is. And, for many, this is an evasion of law’s true nature. The Nuremberg Race Laws in Nazi Germany, or the apartheid laws in white-dominated South Africa, these people would say, were monstrosities that a proper legal inquiry should denounce, not “facts” it should line up with countless others. And, if the Nazis had won WWII and their laws had become universal—the law of the land everywhere—this outcome would have made them no less monstrosities, and would not have ruled out the right, indeed the duty, of anyone to disobey them. Fact or no fact.

What takes shape here is the same conflict we saw pester linguistics. On the one hand, we have those who want the study of law to be identified with the study of corpora of existing laws, and maybe use computers to find regularities in them. On the other, we have those who want that study to be based on their intuitions of what is right and hence legal—as opposed to what looks legal, or is even asserted (by various—as they believe, illegitimate—authorities) to be legal. Eschewing implicit references to Platonic noumena, I would say that the latter care about (their intuitions on) what ought to be legal—whether it is or not. In doing so, they will often cite momentous figures in the history of thought as being on their side, as when the UN Charter is said to be inspired by Kant’s On Perpetual Peace; but that is a disingenuous appeal to grandiloquent, ineffective argument: Kant, pietist as he was, found ultimate confirmation for his moral theory in that he took it to agree with the intuitions of the common person.

Notice, however, that the case of language is less like that of morality than that of table etiquette: moral intuitions (such as anyone has them) are supposed to apply across the board, to all human behavior, whereas linguistic intuitions are very specific. It’s not just that there are thousands of languages, and each speaker of any of them can have her own intuitions about what is grammatical in it, incommensurable with those of speakers of other languages (as well as possibly different from those of speakers of the same language); it’s also that those languages subdivide into myriad jargons, each appropriate to particular contexts, and even the same speaker can have distinct intuitions about what is grammatical in the different contexts she participates in. So there is a place for introspection here: independently of how I contribute to the English corpus and to empirical studies on it, I can compare my own utterances, or those of others, with what I feel it would be grammatical to say on any given occasion, and find that some of my own utterances, or those of others, do not match up with that standard. (As would happen if I said “I didn’t do nothing,” which I regard as ungrammatical.)

In order to prove themselves scientific, generative linguists have tried to show that they, too, address facts and discover crucial regularities—in people’s linguistic intuitions, that is. Indeed some of them have argued that the regularities they discover are the most impressive: that they belong to language as such, not just to this language or that, even that they are hardwired in our mind or our brain. Good luck with that! is all I can tell them—after decades of trying, not a single such regularity has been established with the assurance we would expect of a scientific law. Or, rather, what regularities are found mirror the ways in which individual speakers were educated in school, or otherwise socialized. Which brings up another important point: intuitions can be at any level of development and refinement, and they can be brought to higher levels by being educated—that is, by being confronted gradually with more and more complex cases and thus growing themselves more and more complex, more and more structured.

Return to the law. In On Liberty, John Stuart Mill writes the following:

“In many cases, though individuals may not do the particular thing so well, on the average, as the officers of government, it is nevertheless desirable that it should be done by them, rather than by the government, as a means to their own mental education—a mode of strengthening their active faculties, exercising their judgment, and giving them a familiar knowledge of the subjects with which they are thus left to deal. This is a principal, though not the sole recommendation of jury trial” (108).

Someone sitting on a jury for the first time may have only a rough intuition of how the law applies to individual cases, and it is just by hearing the ins and outs of the case, and the arguments presented by the opposing parties, that she can nurture better—more articulate, more comprehensive, more coherent—intuitions. Having gone through this exercise untold times, a jurist probably has the finest such intuitions. Same thing with language: most people’s untutored intuitions are rudimentary, sometimes inconsistent, and should be made to grow by appropriate exercise. It is linguists—not surprisingly, as they do that kind of exercise all the time—who have the most refined, sophisticated intuitions, so it is no surprise either that they should interrogate themselves so constantly: they have a lot more to deliver than the ordinary Joe, and the knotty problems they keep confronting will have the effect of making their deliveries ever more refined and sophisticated.

At this point, a generative linguist may think that, with friends like me, she does not need enemies. I have let her empirical counterparts run free; I have even granted them the prerogative of dealing with facts, leaving open to her only the Indian reservation of accounting for values. Even with values, I have assigned those concerning us here minimal scope, ridiculously small as compared with those applying to legal, or moral, theories: dismissing out of hand the exorbitant claims to a universal, and universally hardwired, grammar, I have confined them, mostly, to what is relevant to a small group that is privy to a jargon, or maybe a single individual working out his idiolect, or maybe that individual working out a single phase of his idiolect’s development. If this is what the whole enterprise comes to, would it not be a good idea to shut it down?

So I come to my last point: one that, I am sure, will not satisfy those asking the last question on behalf of generative linguistics, but also one that, I believe, ought to satisfy them, and indeed make the case for caring about linguistic intuitions for everyone. I begin by telling a personal story. The last time I taught a logic course was at a private Italian university that specializes in economics and business. In one of the first lectures, I made my usual point that advertisements are infested with fallacies. Taking as an example a video of George Clooney promoting Martini, I asked the students: What is the conclusion here? That we should buy Martini because Clooney drinks it? How much does Clooney know about Martini, or drinks in general? Do we even know that Clooney will come close to Martini when he is not paid to do so? In the past, such questions had elicited (the same verb used for what generative linguists do to intuitions—and not by chance) conspicuous nodding, and some knowing smiles. On this occasion, the class looked stolid, and, after a moment’s discomfort, one student spoke for all: What’s wrong with it, if it works?

“To work,” in the sense in which the student used the verb, contains a hidden parameter: it implies working for a purpose, where the nature of that purpose is not defined and hence is left entirely optional. Something might work, in this sense, in order to sell Martini or make a prisoner reveal secrets under torture. If one tells another, or oneself, to do that something, one is issuing what Kant called a hypothetical imperative, which falls well within the empirical realm of facts: it is a fact, or it isn’t, that doing it works for the purpose implied there. Using “to work” without mentioning for what gives the impression that one is venting a value; but that is not the case. As soon as the implied purpose is made explicit, what is vented is seen to be nothing other than a cause-effect relation, which may or may not hold but, whether it does or it doesn’t, is fully included in the factual province of empirical science. Nothing is being said on whether any of this is worthwhile. So the impassive reaction of my class, and of its spokesperson, was an indication of its being, or becoming, a foreigner to the very notion of value.

The overwhelming success of empirical disciplines in the modern world has made people, and increasingly young people, less and less sensitive to the distinctiveness of values. Sadly, even the late success of empirical linguistics is a brick in this wall. The problem—for those who see it as a problem—can be faced at a large scale, by launching campaigns that try to mobilize crowds against war, or for the rights of minorities, or for saving the planet. But I suspect that none of that will resolve the problem, in the long run, if it is not coupled with a more minute, humble, relentless effort directed at the microphysics of values: at the elementary judgments of what is right or wrong, correct or incorrect, valid or invalid, grammatical or ungrammatical, never mind how much any of it “works”—judgments that could well be diametrically opposed to one another, as long as expressing and defending them makes people more sensitive to the autonomy of values, much like members of a jury are made more sensitive to the autonomous integrity of the law. But the law is not always with us, whereas language is, it fills every nook and cranny in our lives; therefore, it is within language, first of all, that we should care to inject this sensitivity. Language is the quotidian scene on which values are, and ought to be, played out.

A couple of examples might help crystallize my last, and to me most important, point. Some people feel, strongly, that toilet paper ought to be rolled from the top; others feel, just as strongly, that it ought to be rolled from the bottom (I belong to the first category). Some people strongly feel it is right to peel a banana from one end; others strongly feel the opposite. I can imagine that one could draw subtle, interesting lessons from such feelings (such intuitions) about the personalities of the different classes of people—not, however, about toilet paper or bananas. At most, if you run statistics on them, they can help you market those products (thus perversely turning values to instrumental use). Similarly, some people feel strongly, as I do, that they ought not to say “I didn’t do nothing” (that it is ungrammatical to say so), and others feel just as strongly that there is nothing wrong with that sentence; their intuitions, again, might tell us something about them (even about how to best deal with them), not about the English language. And, at any rate, the real enemy for me is the one who, if you ask him how to roll toilet paper, or peel a banana, or whether or not to say “I didn’t do nothing,” will answer: “Whatever works.”

So let empirical linguists collect and review their corpora, identifying regularities in them (by computers, if need be), and let generative linguists cultivate their own, and their students’, intuitions about the grammaticalness of whatever language or jargon they speak, and make their test utterances more and more intricate as their perceptiveness grows. What the latter are cultivating, perhaps unwittingly, is the place of values in a world of facts. It is a good battle to fight, on the side of the better angels of our nature.

Ermanno Bencivenga is a Distinguished Professor of Philosophy and the Humanities, Emeritus, at the University of California. The author of seventy books in three languages and one hundred scholarly articles, he was the founding editor of the international philosophy journal Topoi (Springer) for thirty years, as well as of the Topoi Library. Among his books in English are Understanding Edgar Allan Poe: They Who Dream by Day (Newcastle upon Tyne UK: Cambridge Scholars, 2023); Kant’s Copernican Revolution (New York: Oxford University Press, 1987); The Discipline of Subjectivity: An Essay on Montaigne (Princeton NJ: Princeton University Press, 1990); Logic and Other Nonsense: The Case of Anselm and His God (Princeton NJ: Princeton University Press, 1993); A Theory of Language and Mind (Berkeley and Los Angeles: University of California Press, 1997); Hegel’s Dialectical Logic (New York: Oxford University Press, 2000); Ethics Vindicated: Kant’s Transcendental Legitimation of Moral Discourse (New York: Oxford University Press, 2007); Theories of the Logos (Berlin: Springer, 2017).


I thank Salvatore Pistoia Reda for deep and illuminating discussions on the topic of this paper. Though there remain substantial disagreements between us, I have learned a lot from him.


