Introduction

Influenced by ontology and post-modern trends, critical discourse analysis (hereafter CDA) concerns social issues, focusing not only on language or language use but also on the linguistic features in cultural structures and social processes (Titscher et al., 2000; Sun and Cheng, 2017). Cognitive discourse analysis has been one of the most important research directions of CDA during the past 30 years; this approach analyzes discourse from the perspective of cognitive psychology, attaching importance to the mediating role of cognition (Fairclough and Wodak, 1997; O’Halloran, 2003; Hart, 2010; van Dijk, 2014a). Among these studies, van Dijk (1998) suggests a socialized method to study cognitive psychology in discourse processing, attempting to discover how discourse participates in social life from the linguistic perspective. Unlike the discussion of language and social relations in sociolinguistics or linguistic anthropology, van Dijk’s socio-cognitive discourse analysis (hereafter SCA), also called the “Discourse–Cognition–Society Triangle”, studies psychological representation, discourse processing, shared knowledge, and ideology (van Dijk, 1998, 2009a). In SCA, discourse is redefined as a form of social interaction and a reproduction of social cognition; thus, the social attribute of discourse is explained from the social and cognitive perspective. Moreover, the role of personal and social knowledge is essential in text interpretation (van Dijk, 2014a). Accordingly, this study explores Chinese court judgments from social, cognitive, and discursive perspectives, and analyzes a large amount of empirical data statistically to show the deep relationship among discourse components, cognitive sources, and corresponding social functions in judicial discourse.

A court judgment is a common legal discourse and the final carrier of court trial activities. A court judgment is “a decision made by a court in respect of the matter before it” (Martin, 2009, p. 271), which may be interlocutory, deciding a particular issue prior to the trial of the case, or may dispose of the case finally. A court judgment may impose personal liability on a party or determine some issues of right, status, or property (Martin, 2009). The discursive presentation of different power agents can find its root in social construction because the discursive structure is constrained by and representative of social structure (Cheng, 2012). Regarding social relations, a court judgment is a ruling on a specific social conflict. Because social conflicts differ in type, there are civil, criminal, and administrative judgments in China. However, Chinese legal texts have much more informational and slightly abstract styles (Sun and Cheng, 2017). A court judgment is a model type of logical reasoning with tangible evidence; thus, it is significant to explore the cognitive source and social function according to the discourse components. This study builds a corpus of Chinese court judgments (hereafter CCJ) of 1,740,000 words from the Supreme People’s Court (SPC) and the local people’s courts in China. There are 54 civil, 57 criminal, and five administrative judgments, including cases involving murder, robbery, property disputes, and other offenses. Statistical analysis and manual labeling are carried out on the discourse components, cognitive sources, and social functions to reveal the socio-cognitive discourse construction of Chinese court judgments.

Literature review

As a social value system, the law is composed of language. Even the rules, regulations, court trials, and various judicial procedures are all realized through language (John, 2003). In the sociology of law, many studies explore the forms of specific inequalities built into the law, legal processes, and decision-making (Seron and Munger, 1996; Sandefur, 2008). Although CDA shows interest in the role of language and communication in social functions, particularly in the exercise of power and control, it is notable that much less attention is paid to the analysis of legal language (Cheng and Machin, 2022). A court judgment is an essential type of legal discourse, while previous studies often focus on the wording, syntactic structure, and discourse style of legal provisions. Many comparative studies analyze the discourse styles of court judgments in different countries, concluding that the different roles of judges in the judicial system produce completely different discourse styles (Wetter, 1960; Kurzon, 2001; Cheng, 2007a; Cheng et al., 2008). The textual features of court judgments in different countries are constrained by and representative of their respective social structures from a multi-dimensional perspective (Bhatia, 2004; Cheng, 2010). Solan (2010) believes that in writing a court judgment, judges can only accept the views of evidence admitted by the judicial system and must decisively explain the trial. Besides the role of the judge, the value system of court judgments can be divided into legal and social value systems. Thus, judgments will reflect multiple values, which could be compatible or conflicting (Cheng, 2007b). In SCA, social cognition is located both in the social structure at the macro level and in the specific interactions or events at the micro level, which helps to reveal the deep rules of court judgments from the discourse analysis.

In CDA, the early researchers analyze the types of court judgments to explore their intertextuality, interpersonal relationship, and communicative purposes (Bhatia, 1993; Maley, 1994; John, 1994). Intertextuality permeates the judicial discourse in many different aspects, and all judgment precedents are a matter of intertextuality. Court judgments are the written forms of interaction among previous judges, litigants, lawyers, and expert witnesses, which embodies heavy intertextuality (Yu, 2021). In this sense, a court judgment involves the consultation among different levels of texts, and it should use the different texts effectively and simultaneously, taking into account the political and ideological functions and social customs (Bell and Pether, 1998). Meanwhile, legal texts exhibit distinctive discourse structures and message distribution depending on communicative functions and genres (Cheng et al., 2008). As the typical legal discourse, a court judgment is the discursive representation of judicial thinking, representing how judges apply the principles and methods in judicial proceedings through adjudication, including case hearings, trials, and decision-making (Cheng, 2010). In this way, a court judgment carries ideas and values and shapes social practices (Cheng and Machin, 2022). In the past 20 years, the research perspective of court judgments has been more diversified, especially of the Chinese judgments. Cheng et al. (2008) examine the linguistic characteristics, moves, and rhetoric of Chinese and American court judgments to specify the rhetorical preferences of “standard” judgments. Solan (2010) analyzes the reflexive pronouns in English court judgments, believing the ambiguity of reference can cause potentially disastrous consequences. Cheng (2012) deals with the attribution and judicial control from the authorial voices in a corpus-based study. 
Cheng and Cheng (2014) examine how epistemic modality is employed in civil judgments to construct legal facts and indicate legal probability. From a semiotic perspective, Wu and Cheng (2020) construct a model of evidentiality in Chinese court judgments. Yu (2021) argues that the reporting verbs reflect how judges identify the evidence of different documents in Chinese court judgments. Alghazzawi et al. (2022) use an LSTM + CNN neural network model with an optimal feature set to predict court judgments efficiently. Overall, these studies analyze the relationship between language and the law. From the perspective of CDA, the language of the law classifies the world and represents identities and human agency (Cheng and Machin, 2022).

Data and methods

Corpus linguistics (hereafter CL) is a methodology for studying the use of language. A corpus-based approach looks at the tangible evidence of the corpus and analyzes it to find the probabilities, trends, patterns, and co-occurrences of elements, features, or groupings of features (Teubert and Krishnamurthy, 2007). The present study is situated within corpus-based CDA, since non-obvious meaning is not accessible to the naked eye and direct observation (Partington et al., 2013). CL makes interpretations more trustworthy (Subtirelu and Baker, 2018) and guards against subjectivity and over-generalization (Hart and Cap, 2014). The integration of CL and CDA can be traced back to the 1990s (Subtirelu and Baker, 2018) and has been extensively used to approach the discourse of law (Wu and Sun, 2019; Zhao et al., 2021; Cheng and Machin, 2022) and media (van Dijk, 2021; Pei et al., 2022). This paper conducts a socio-cognitive discourse analysis, and the corpus in the present study contains 1.74 million words, randomly selected from China Court Network. A corpus-based method not only accounts for a much broader range of data than introspective approaches but also produces more exact results through mechanical retrieval (Stefanowitsch and Gries, 2006).

The corpus search and analysis methods are word frequency profiles, concordances, semantic analysis, and a linear regression model (LRM). First, the expressions indicating cognitive source and social function are retrieved and listed with Wmatrix3.0, a corpus analysis tool whose automatic semantic analysis of words is based on the tagset of the UCREL Semantic Analysis System (USAS) (Rayson, 2008). Then, the word frequency lists and the co-text in relevant concordances are examined for the cognitive and social expressions with ConcGram1.0, a phraseological search program used to generate single-word frequency lists from the corpus (Cheng et al., 2009; Greaves, 2009). Wmatrix3.0 and ConcGram1.0 are thus the primary tools, supplemented by manual labeling and calculation. As the key variables, the discourse components, together with the cognitive sources and social functions, are manually summarized and shown in Table 1. Next, based on LRM, Stata16.0 is used to explore the possible quantitative relationships between the variables; LRM is a mathematical model for determining the relationship between the variables. Finally, the discourse of Chinese court judgments is interpreted at the cognitive and social levels from the perspective of van Dijk’s Discourse–Cognition–Society triangle.
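To make the first two retrieval steps concrete, the following minimal Python sketch builds a word frequency profile and keyword-in-context (concordance) lines. It is not the Wmatrix3.0 or ConcGram1.0 procedure and does not reproduce their interfaces; the function names, the window size, and the toy pinyin-tokenized sample are hypothetical and are not drawn from CCJ.

```python
from collections import Counter

def frequency_profile(tokens, targets):
    """Count how often each target expression occurs in the token list."""
    counts = Counter(tokens)
    return {t: counts[t] for t in targets}

def concordance(tokens, keyword, window=3):
    """Return keyword-in-context lines with `window` tokens on either side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

# Hypothetical toy sample (assumption: text already segmented into tokens).
sample = (
    "xu shuo1 xiang4mu4 yi3jing1 tan2hao3 fa3yuan4 ren4wei2 bei4gao4 "
    "ying1gai1 pei2chang2 zheng4ren2 shuo1 ta1 kan4dao4 cun2zhe2"
).split()

profile = frequency_profile(sample, ["shuo1", "ren4wei2", "ying1gai1", "kan4dao4"])
hits = concordance(sample, "shuo1", window=2)

for line in hits:
    print(line)
```

In a real workflow the concordance lines would then be read manually to decide which cognitive source and social function each hit marks, which is the manual labeling step described above.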

Table 1 Definition of key variables.

Based on LRM, this paper uses Stata16.0 to explore the possible quantitative relationships between discourse components, cognitive sources, and social functions; the models are given in Eqs. (1)–(4):

$$\begin{array}{l}{{{\mathrm{Cit}}}} = \alpha + \beta _1{{{\mathrm{Fai}}}} + \beta _2{\mathrm{Ind}} + \beta _3{\mathrm{Par}} + \beta _4{\mathrm{Inf}} + \beta _5{\mathrm{Wor}} \\\qquad+\, \beta _6{\mathrm{Phr}} + \beta _7{\mathrm{Sen}}\end{array}$$
(1)
$$\begin{array}{l}{{{\mathrm{Dep}}}} = \alpha + \beta _1{{{\mathrm{Fai}}}} + \beta _2{\mathrm{Ind}} + \beta _3{\mathrm{Par}} + \beta _4{\mathrm{Inf}} + \beta _5{\mathrm{Wor}} \\\qquad+\, \beta _6{\mathrm{Phr}} + \beta _7{\mathrm{Sen}}\end{array}$$
(2)
$$\begin{array}{l}{\mathrm{Dis}} = \alpha + \beta _1{{{\mathrm{Fai}}}} + \beta _2{\mathrm{Ind}} + \beta _3{\mathrm{Par}} + \beta _4{\mathrm{Inf}} + \beta _5{\mathrm{Wor}} \\\qquad+\, \beta _6{\mathrm{Phr}} + \beta _7{\mathrm{Sen}}\end{array}$$
(3)
$$\begin{array}{l}{{{\mathrm{Sum}}}} = \alpha + \beta _1{{{\mathrm{Fai}}}} + \beta _2{\mathrm{Ind}} + \beta _3{{{\mathrm{Par}}}} + \beta _4{\mathrm{Inf}} + \beta _5{\mathrm{Wor}} \\\qquad+\, \beta _6{\mathrm{Phr}} + \beta _7{\mathrm{Sen}}\end{array}$$
(4)

The study takes the social functions (citation, depiction, distance, and summary) as the explained variables, and the cognitive sources (faith, induction, paraphrase, and inference) and discourse components (vocabulary, phrases, and sentences) as the explanatory variables. The relationships between the four explained variables and the seven explanatory variables are analyzed in the four models, where βi is the influence coefficient of the i-th explanatory variable and α is a constant term. The regression results are shown in Table 2, which displays the relationships between the explained and explanatory variables in CCJ. In Model (1), Fai and Phr both have significant positive effects on Cit. In Model (2), Ind and Par have significant positive effects on Dep. In Model (3), Par has a significant positive effect on Dis. In Model (4), Inf, Phr, and Sen all have significant positive effects on Sum. On the whole, different cognitive sources influence the corresponding social functions, while Phr and Sen among the discourse components affect some social functions.
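As a rough illustration of how a model of the form of Eq. (1) can be estimated, the sketch below fits a linear regression by ordinary least squares in plain Python. It is not the Stata16.0 procedure used in this study, and ordinary least squares is assumed here rather than confirmed by the text. The function names (`solve`, `ols`), the reduction to two explanatory variables (Fai and Phr), and the per-judgment counts are hypothetical; the explained variable is generated from known coefficients so that the fit is easy to check.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ols(rows, y):
    """Least-squares fit of y = alpha + sum_i beta_i * x_i; returns [alpha, beta_1, ...]."""
    X = [[1.0] + list(r) for r in rows]  # prepend the constant term
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)  # normal equations (X'X) b = X'y

# Hypothetical counts of (Fai, Phr) markers in six judgments.
rows = [(3, 10), (5, 4), (2, 8), (7, 6), (4, 12), (6, 2)]
# Hypothetical Cit counts generated as Cit = 1 + 2*Fai + 0.5*Phr.
cit = [1.0 + 2.0 * fai + 0.5 * phr for fai, phr in rows]

alpha, b_fai, b_phr = ols(rows, cit)
print(round(alpha, 6), round(b_fai, 6), round(b_phr, 6))
```

A positive estimated coefficient corresponds to what the text calls a positive effect; whether it is significant additionally depends on its standard error, which this sketch does not compute.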

Table 2 Regression result of Model 1 to 4.

To avoid collinearity among the research variables and distortion of the regression model, this paper uses the variance inflation factor (VIF) to test the degree of multicollinearity among the observed variables with the help of SPSS23.0. A VIF greater than 10 indicates high multicollinearity among the variables. The test results are shown in Table 3: the VIF of each variable is below 10, indicating that there is no multicollinearity problem among the variables and that the model results are valid.
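For readers unfamiliar with the statistic, the VIF of an explanatory variable j is 1 / (1 − Rj²), where Rj² is the coefficient of determination obtained by regressing variable j on the remaining explanatory variables. The sketch below computes it in plain Python; it is not the SPSS23.0 routine, and the function names and the two toy columns (standing in for any two explanatory variables) are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def r_squared(rows, y):
    """R^2 of a least-squares regression of y on the columns in rows (with intercept)."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    coef = solve(xtx, xty)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - sum(c * xi for c, xi in zip(coef, x))) ** 2 for x, yi in zip(X, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def vif(columns):
    """VIF of each column: 1 / (1 - R^2) from regressing it on all other columns."""
    out = []
    for j, target in enumerate(columns):
        others = [c for i, c in enumerate(columns) if i != j]
        rows = list(zip(*others))
        out.append(1.0 / (1.0 - r_squared(rows, target)))
    return out

# Hypothetical values of two explanatory variables across five judgments.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]

vifs = vif([x1, x2])
no_multicollinearity = all(v < 10 for v in vifs)  # threshold used in the text
print([round(v, 3) for v in vifs], no_multicollinearity)
```

With only two explanatory variables, both VIFs reduce to 1 / (1 − r²), where r is their Pearson correlation; the toy columns have r = 0.8, so both values stay well under the threshold of 10.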

Table 3 VIF test result.

To further study the citation function, Models (5)–(7) examine the possible relationship between the discourse components and the specific segments of law, regulation, and evidence in the cognitive source of faith, respectively. The regression results are shown in Table 4. In Models (5)–(7), as the primary knowledge sources of faith, Law, Reg, and Evi each have a significant positive effect on Cit, and Phr and Sen also have a significant positive effect on Cit. In Models (8)–(10), the three explanatory variables Law, Reg, and Evi are entered simultaneously, and the results are unchanged.

$${{{\mathrm{Cit1}}}} = \alpha + \beta _1{{{\mathrm{Law}}}} + \beta _2{\mathrm{Wor}} + \beta _3{\mathrm{Phr}} + \beta _4{{{\mathrm{Sen}}}}$$
(5)
$${{{\mathrm{Cit2}}}} = \alpha + \beta _1{{{\mathrm{Reg}}}} + \beta _2{\mathrm{Wor}} + \beta _3{\mathrm{Phr}} + \beta _4{{{\mathrm{Sen}}}}$$
(6)
$${{{\mathrm{Cit3}}}} = \alpha + \beta _1{{{\mathrm{Evi}}}} + \beta _2{\mathrm{Wor}} + \beta _3{\mathrm{Phr}} + \beta _4{{{\mathrm{Sen}}}}$$
(7)
$${{{\mathrm{Cit1}}}} = \alpha + \beta _1{{{\mathrm{Law}}}} + \beta _2{{{\mathrm{Reg}}}} + \beta _3{\mathrm{Evi}} + \beta _4{\mathrm{Wor}} + \beta _5{\mathrm{Phr}} + \beta _6{{{\mathrm{Sen}}}}$$
(8)
$${{{\mathrm{Cit2}}}} = \alpha + \beta _1{{{\mathrm{Law}}}} + \beta _2{{{\mathrm{Reg}}}} + \beta _3{\mathrm{Evi}} + \beta _4{\mathrm{Wor}} + \beta _5{\mathrm{Phr}} + \beta _6{{{\mathrm{Sen}}}}$$
(9)
$${{{\mathrm{Cit3}}}} = \alpha + \beta _1{{{\mathrm{Law}}}} + \beta _2{{{\mathrm{Reg}}}} + \beta _3{\mathrm{Evi}} + \beta _4{\mathrm{Wor}} + \beta _5{\mathrm{Phr}} + \beta _6{{{\mathrm{Sen}}}}$$
(10)
Table 4 Regression result of Model 5 to 10.

Next, vocabulary, phrases, and sentences in the discourse components are further subdivided to explore the quantitative relationship between different social functions and cognitive sources in CCJ. In Models (11)–(14), the social functions (citation, depiction, distance, and summary) are still taken as the explained variables, and there are ten explanatory variables: (1) Fps+V: first-person subject+verb; (2) Tps+V: third-person subject+verb; (3) Ups+V: unknown-person subject+verb; (4) H-ad: adverb with high confidence; (5) M-ad: adverb with medium confidence; (6) L-ad: adverb with low confidence; (7) Pre-P: prepositional phrase; (8) Ver-P: verb-object phrase; (9) Con-C: conditional clause; (10) Cau-C: causal clause. These models show how discourse components influence social functions. Models (15)–(18) indicate how the cognitive sources of faith, induction, paraphrase, and inference, respectively, are affected by different discourse components. The regression results are shown in Table 5:

$$\begin{array}{l}{{{\mathrm{Cit}}}} = \alpha + \beta _1{\mathrm{Fps}} + \beta _2{\mathrm{Tps}} + \beta _3{\mathrm{Ups}} + \beta _4{{{\mathrm{H-ad}}}} + \beta _5{{{\mathrm{M-ad}}}} \\\qquad\;\,\,+\, \beta _6{{{\mathrm{L-ad}}}} + \beta _7{{{\mathrm{Pre-P}}}} + \beta _8{{{\mathrm{Ver-P}}}} + \beta _9{{{\mathrm{Con-C}}}} + \beta _{10}{{{\mathrm{Cau-C}}}}\end{array}$$
(11)
$$\begin{array}{l}{{{\mathrm{Dep}}}} = \alpha + \beta _1{\mathrm{Fps}} + \beta _2{\mathrm{Tps}} + \beta _3{\mathrm{Ups}} + \beta _4{{{\mathrm{H-ad}}}} + \beta _5{{{\mathrm{M-ad}}}} \\\qquad\quad+\, \beta _6{{{\mathrm{L-ad}}}} + \beta _7{{{\mathrm{Pre-P}}}} + \beta _8{{{\mathrm{Ver-P}}}} + \beta _9{{{\mathrm{Con-C}}}} + \beta _{10}{{{\mathrm{Cau-C}}}}\end{array}$$
(12)
$$\begin{array}{l}{{{\mathrm{Dis}}}} = \alpha + \beta _1{\mathrm{Fps}} + \beta _2{\mathrm{Tps}} + \beta _3{\mathrm{Ups}} + \beta _4{{{\mathrm{H-ad}}}} + \beta _5{{{\mathrm{M-ad}}}} \\\qquad\;\;\;+\, \beta _6{{{\mathrm{L-ad}}}} + \beta _7{{{\mathrm{Pre-P}}}} + \beta _8{{{\mathrm{Ver-P}}}} + \beta _9{{{\mathrm{Con-C}}}} + \beta _{10}{{{\mathrm{Cau-C}}}}\end{array}$$
(13)
$$\begin{array}{l}{{{\mathrm{Sum}}}} = \alpha + \beta _1{\mathrm{Fps}} + \beta _2{\mathrm{Tps}} + \beta _3{\mathrm{Ups}} + \beta _4{{{\mathrm{H-ad}}}} + \beta _5{{{\mathrm{M-ad}}}} \\\qquad\quad\,+\, \beta _6{{{\mathrm{L-ad}}}} + \beta _7{{{\mathrm{Pre-P}}}} + \beta _8{{{\mathrm{Ver-P}}}} + \beta _9{{{\mathrm{Con-C}}}} + \beta _{10}{{{\mathrm{Cau-C}}}}\end{array}$$
(14)
$$\begin{array}{l}{{{\mathrm{Fai}}}} = \alpha + \beta _1{\mathrm{Fps}} + \beta _2{\mathrm{Tps}} + \beta _3{\mathrm{Ups}} + \beta _4{{{\mathrm{H-ad}}}} + \beta _5{{{\mathrm{M-ad}}}} \\\qquad\;\;\,+\, \beta _6{{{\mathrm{L-ad}}}} + \beta _7{{{\mathrm{Pre-P}}}} + \beta _8{{{\mathrm{Ver-P}}}} + \beta _9{{{\mathrm{Con-C}}}} + \beta _{10}{{{\mathrm{Cau-C}}}}\end{array}$$
(15)
$$\begin{array}{l}{{{\mathrm{Ind}}}} = \alpha + \beta _1{\mathrm{Fps}} + \beta _2{\mathrm{Tps}} + \beta _3{\mathrm{Ups}} + \beta _4{{{\mathrm{H-ad}}}} + \beta _5{{{\mathrm{M-ad}}}} \\\qquad\;\;\;+\, \beta _6{{{\mathrm{L-ad}}}} + \beta _7{{{\mathrm{Pre-P}}}} + \beta _8{{{\mathrm{Ver-P}}}} + \beta _9{{{\mathrm{Con-C}}}} + \beta _{10}{{{\mathrm{Cau-C}}}}\end{array}$$
(16)
$$\begin{array}{l}{{{\mathrm{Par}}}} = \alpha + \beta _1{\mathrm{Fps}} + \beta _2{\mathrm{Tps}} + \beta _3{\mathrm{Ups}} + \beta _4{{{\mathrm{H-ad}}}} + \beta _5{{{\mathrm{M-ad}}}} \\\qquad\;\;\;+\, \beta _6{{{\mathrm{L-ad}}}} + \beta _7{{{\mathrm{Pre-P}}}} + \beta _8{{{\mathrm{Ver-P}}}} + \beta _9{{{\mathrm{Con-C}}}} + \beta _{10}{{{\mathrm{Cau-C}}}}\end{array}$$
(17)
$$\begin{array}{l}{{{\mathrm{Inf}}}} = \alpha + \beta _1{\mathrm{Fps}} + \beta _2{\mathrm{Tps}} + \beta _3{\mathrm{Ups}} + \beta _4{{{\mathrm{H-ad}}}} + \beta _5{{{\mathrm{M-ad}}}} \\\qquad\;\;+\, \beta _6{{{\mathrm{L-ad}}}} + \beta _7{{{\mathrm{Pre-P}}}} + \beta _8{{{\mathrm{Ver-P}}}} + \beta _9{{{\mathrm{Con-C}}}} + \beta _{10}{{{\mathrm{Cau-C}}}}\end{array}$$
(18)
Table 5 Regression result of Model 11 to 18.

In Model (11), the regression results show that Pre-P and Cau-C have significant positive effects on Cit. Fps and H-ad have significant positive effects on Dep in Model (12) and on Ind in Model (16), while Ups has a significant negative effect on Dep. Model (13) indicates that Tps has a significant positive effect on Dis, while Con-C has a significant negative effect on Dis. In Model (14), Pre-P, Ver-P, and Cau-C all have significant positive effects on Sum, while Pre-P and Ver-P have significant positive effects on Fai in Model (15). In Model (17), Tps, Ups, H-ad, and M-ad all have significant positive effects on Par. Model (18) shows that Pre-P, Ver-P, Cau-C, and Con-C all have significant positive effects on Inf. In general, the vocabulary items marked in CCJ exert a specific influence on the cognitive source of paraphrase and the social function of depiction, and the phrases and sentences marked in CCJ have a particular impact on the cognitive source of inference and the social function of summary.

Robustness analysis

To ensure the validity and applicability of the empirical model, this study replaces the original regression method with the Tobit method to test the robustness of the model. The robustness analysis of Models (1)–(4) is shown in Table 6, and that of Models (11)–(18) in Table 7. The results show that the significance of the variables is unchanged, which is consistent with the regression results in Tables 2 and 5.
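For orientation, the standard Tobit specification can be sketched as follows. The censoring point is not stated in this section, so left-censoring at zero is an assumption made here (it is a common choice for frequency data that cannot fall below zero); x_{ik} stands for the explanatory variables of the model being re-estimated, and φ and Φ denote the standard normal density and distribution function.

$$y_i^{*} = \alpha + \sum_{k} \beta_k x_{ik} + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2), \qquad y_i = \max(0,\, y_i^{*})$$

$$\ln L = \sum_{y_i > 0} \left[ \ln \phi\!\left( \frac{y_i - \alpha - \sum_k \beta_k x_{ik}}{\sigma} \right) - \ln \sigma \right] + \sum_{y_i = 0} \ln \Phi\!\left( - \frac{\alpha + \sum_k \beta_k x_{ik}}{\sigma} \right)$$

The coefficients are obtained by maximizing ln L. The robustness check then asks whether the sign and significance of each coefficient agree with those from the original regression.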

Table 6 Robustness analysis of Model 1 to 4.
Table 7 Robustness analysis of Model 11 to 18.

Findings and analysis

CDA holds a “critical perspective, position or attitude within the discipline of multi-disciplinary Discourse Studies” (van Dijk, 2016, p. 62), focusing on various forms of the complex relations between social structures and discourse structures (van Dijk, 2018). This study draws upon van Dijk’s SCA to link discourse components to cognitive sources and social functions through a complex socio-cognitive interface.

Discourse dimension

On the one hand, similar to Saussure’s “utterance”, van Dijk’s (2008) discourse refers to language in practical use, which is a complex and hierarchical phenomenon consisting of form (vocabulary, phrase, and sentence), meaning, and action. On the other hand, as a part of social practices, discourse is constructively connected with the context and discourse subject, further developing Foucault’s (1995) discourse view. The statistical analysis of CCJ shows that the vocabulary items indicating the cognitive source and social function include verbs and adverbs, among which the verbs consist of speech verbs, sensory verbs, cognitive verbs, and modal verbs, and the adverbs are categorized according to the degree of value. The verb is the core of Chinese syntactic and semantic structure (Wu and Cheng, 2020). The general use of different verbs shows the cognitive source of the discourse subject and the attitude to the information mentioned. In the linguistic study of the Chinese language, a speech verb means “speaking”, that is, expressing meaning with words. “Speaking” is a collective concept, the most crucial subcategory in the semantic field of speech (Wang, 2004). The top 3 high-frequency speech verbs in CCJ are “shuo1 (say)” (1276 items), “cheng1/sheng1cheng1 (express)” (741 items), and “zhu3zhang1 (claim)” (496 items). In Extract 1, witness Cui uses “shuo1 (say)” to paraphrase Xu’s speech content, which means the information source expressed by the speaker comes from another’s utterance or writing (Hu, 1995), and thus the speech verb “said” marks the cognitive source of paraphrase. Sensory verbs denote obtaining external information through the human senses, and the top 2 high-frequency sensory verbs in CCJ are “kan4/kan4jian4/kan4dao4 (see)” (348 items) and “ting1/ting1jian4/ting1dao4 (listen)” (182 items). 
In Extract 2, the sensory verb “kan4dao4 (saw)” marks the cognitive source of induction, since humans make further rational judgments through the direct perception of the sensory organs (Hu, 1995). The cognitive verb is related to the psychological activities of human beings (Kellogg, 2003), and it reflects subjective human cognition, expressing psychological activities or states, mainly including “thinking, feeling and knowing” (Wu and Cheng, 2020). The top 2 high-frequency cognitive verbs in CCJ are “ren4wei2 (hold)” (912 items) and “zhi1dao4 (know)” (336 items). In Extract 3, the court of the second instance restates the judicial view of “the court of the first instance”, and uses the cognitive verb “ren4wei2 (held)” to mark the cognitive source of paraphrase. Modality refers to the speaker’s commitment to the proposition (Katriel and Dascal, 1989), and modal verbs show the speaker’s attitude toward the possibility of the information (Le, 2014). The top high-frequency modal verb in CCJ is “ying1gai1 (should)” (2512 items). In Extract 4, the modal verb “ying1gai1 (should)” combines with the prepositional phrase “gen1ju4 (according to)” to indicate the cognitive source of inference and belief, since a “contract” is a kind of documentary evidence, and the speaker conducts inference based on the existing information (De Haan, 2001).

Extract 1 Testimony of witness Cui: in the summer of 2008, Xu said (shuo1) that there was an office building project in a tourism service area of County Jing, and the cooperation had already been negotiated. If I paid the deposit, the contract could be signed.

Extract 2 Tao privately saw (kan4dao4) that Xue had more than 10 million yuan in a passbook.

Extract 3 The court of first instance held (ren4wei2) that: the defender of Jing, Xue, and Wang believes that the parties have conducted a civil lawsuit on the several facts of this case, and thus it should not be identified as a criminal case.

Extract 4 There should (ying1gai1) all be civil disputes, according to (gen1ju4) the contract signed by the appellant and Yang, the contract signed with Xishi Company, the contract signed with Hongda Company, the contract signed with the Nanjing branch of Shanghai Baoye Company, and the contract signed by Huahong Municipal Company.

In SCA, the participant is an important category, mainly referring to individuals in discourse. The relationship among speakers, listeners, and other discourse participants is dynamically constructed in the social context (van Dijk, 2008). In CCJ, the syntactic structure “third-person subject+verb (Tps+V)” is the most widely used, accounting for 90.88% of all verb constructions that convey cognitive and social meaning, such as “the court of the first instance held” in Extract 3, which marks that the information comes from a clearly identified participant. The structure “first-person subject+verb (Fps+V)” accounts for 3.64%, as in “wo3jue2de2 (I thought)” in Extract 5; here the pronoun “I” does not refer to the court judge but occurs in a direct quotation of the testimony. The structure “unknown-person subject+verb (Ups+V)” accounts for 5.48%; for example, “nong2hang2 peng2you3 (a staff member from Agricultural Bank)” is an unknown-person subject in Extract 6, and the predicate “da3 dian4hua4 jiang3 (called to tell)” marks the cognitive source of paraphrase.

Extract 5 On the night of April 29th, Lu volunteered to babysit for my son, but I thought (wo3jue2de2) it was the first day we met, so I refused.

Extract 6 The statement from victim Yang: after I had paid the deposit, a staff member from Agricultural Bank called to tell (nong2hang2peng2you3 da3dian4hua4jiang3) me that 500,000 RMB had been transferred, so I felt it was probably a scam, and asked Jing to give the deposit back to the account to continue the project. Otherwise, he should return the deposit and cancel the project.

Personal cognition refers to the way members of a social group subjectively understand discourse, which is manifested as the unique psychological representations of specific situations, events, and actions (van Dijk and Kintsch, 1983). Some adverbs indicating degree can show the speaker’s personal confidence in the information in court judgments. They can be divided into adverbs of high confidence, such as “surely” and “certainly”; medium confidence, such as “about” and “probably”; and low confidence, such as “like” and “seemingly” (Cheng and Sin, 2011). In CCJ, the top high-frequency adverb with high confidence is “ken3ding4 (certainly)” (36 items), accounting for 25% of all adverbs for cognitive and social meaning. In Extract 7, “ken3ding4 (indeed)” indicates the high probability of information in Zhou’s cognition. The top 3 high-frequency adverbs with medium confidence are “ke3neng2 (perhaps)” (124 items), “da4gai4 (probably)” (56 items), and “da4yue1 (about)” (32 items), accounting for 63.10%. “Ke3neng2 (probably)” in Extract 8 marks the cognitive source of inference with medium confidence. The top high-frequency adverb with low confidence is “hao3xiang4 (seemingly)” (36 items), accounting for 11.90%. In Extract 9, the speaker uses “hao3xiang4 (seemingly)” to show low confidence in the contract amount. In SCA, social context is a process of individual construction and perception of the communicative situation (van Dijk, 2008). The court judgments dynamically construct the relationship between the speakers, listeners, and other discourse participants through verb structures and adverbs.

Extract 7 Zhou provided two receipts to the police office, which could prove the deposit was received indeed (ken3ding4).

Extract 8 The witness Bo has a mental disorder, so her ability to testify is in doubt. All of her testimony is made in the death sentence’s probation period, which may affect the testimony’s authenticity because she is probably (ke3neng2) under some extraordinary pressure or for the sake of her meritorious service.

Extract 9 Wang’s confession and defense: I remember going to Gaochun with Ma, Jing, and a driver and signing a contract with a local company. The company seemingly (hao3xiang4) paid a deposit of 200,000 or 300,000 RMB.

Through semantic analysis with Wmatrix3.0 and ConcGram1.0, the present study finds that four syntactic structures, namely the prepositional phrase (Pre-P), the verb-object phrase (Ver-P), the causal clause (Cau-C), and the conditional clause (Con-C), are used to indicate the cognitive source and social function in Chinese court judgments. In CCJ, the top high-frequency Pre-P and Ver-P are “gen1ju4/an4zhao4 (in accordance with)” (5004 items) and “ren4ding4/zheng4ming2…shi4shi2 (identify the fact that…)” (3880 items). In addition, Pre-P and Ver-P have significant positive effects on Fai and Inf in Models (15) and (18); Pre-P accounts for 55.26% of all phrases that convey socio-cognitive meaning, and Ver-P for 44.74%. As a knowledge source of faith (Wu and Cheng, 2020), the law is the enforceable body of rules that govern any society (Martin, 2009, p. 280), and the Pre-P “gen1ju4 (in accordance with)” in Extract 10 cites the legal provisions of the PRC Criminal Law, marking the cognitive source of faith. In Extract 11, combined with “To sum up” and the evidence in that court judgment, the Ver-P “ren4ding4…shi4shi2 (identify the fact that)” makes an estimation based on the known information (De Haan, 2001) with high confidence, indicating the cognitive source of inference.

In CCJ, the top 3 high-frequency Cau-C are “yin1/yin1wei2 (because…)” (1592 items), “gu4 (therefore…)” (740 items), and “yin1ci3 (hence…)” (363 items), which account for 87.49% of the total clauses with socio-cognitive meaning, and the top high-frequency Con-C is “ru2guo3 (if…)” (456 items), accounting for 12.51%. Cau-C is generally used in Chinese court judgments because the result clause states a fact or makes an inference, and the reason clause provides evidence to support it (Chen, 2009, p. 137). In Extract 12, the “zheng4ju4 (evidence)” is provided after the investigation in the reason clause, and the clause conjunction “yin1ci3 (hence)” leads to the conclusion and marks the cognitive source of inference. In Con-C, a hypothetical presupposition is built on the conditional clause (Chen, 2009, p. 133). With the conditional conjunction “ru2guo3 (if)” marked in Extract 13, the result clause reflects reasonable speculation once the presupposition is satisfied.

Extract 10 This court made overall considerations about the harmful consequences of defendant He’s criminal acts and He’s attitude toward the admission of guilt. In accordance with (gen1ju4) paragraph 1 and paragraph 3 of Article 239, Article 52, Article 53, and Article 64 of the Criminal Law of the People’s Republic of China, the judgment is as follows…

Extract 11 To sum up, it’s enough to identify the fact that (ren4ding4…shi4shi2) the defendant He stole the infant Li to blackmail money from the parents. Besides, He’s defense is inconsistent with the verified facts, so this court will not adopt it.

Extract 12 After investigation, the evidence in the case can confirm that the defendant Bo took advantage of his duties to help Tang and Xu’s business, hence (yin1ci3) received the money from them.

Extract 13 If (ru2guo3) the registered trademark infringes upon the prior rights of others, the prior right holder may directly bring a civil lawsuit without the administrative procedure of revoking the trademark.

The explicit language structure is inseparable from the implicit social relations, since discourse structure in SCA is a logical analysis aiming to make microscopic connections to dimensions outside language (van Dijk, 2009a), such as social structure, situation, and cognitive structure. As a realistic legal carrier, the discourse dimension of Chinese court judgments is both society-oriented and cognition-oriented. Unlike Saussure’s structuralist analysis, the present study analyzes the linguistic levels marking cognitive sources to establish a connection with the external world.
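
The frequency counts and percentage shares reported above for Pre-P, Cau-C, and Con-C rest on retrieving marker tokens and dividing by a total. The study carried out this step with Wmatrix3.0 and ConcGram1.0; the Python sketch below is only an illustrative stand-in, assuming a text already transcribed into whitespace-separated, tone-numbered pinyin tokens. The marker inventory is limited to forms quoted in this section, the names `count_markers` and `shares` are hypothetical, and discontinuous patterns such as the Ver-P “ren4ding4…shi4shi2” are left out because they would need pattern matching rather than a token lookup.

```python
from collections import Counter

# Hypothetical marker inventory: only the forms quoted in this section,
# not the full inventory used in the study.
MARKERS = {
    "Pre-P": ["gen1ju4", "an4zhao4"],
    "Cau-C": ["yin1", "yin1wei2", "gu4", "yin1ci3"],
    "Con-C": ["ru2guo3"],
}


def count_markers(tokens):
    """Count exact-token matches of each marker, grouped by structure type."""
    lookup = {marker: kind for kind, forms in MARKERS.items() for marker in forms}
    counts = Counter()
    for token in tokens:
        kind = lookup.get(token)
        if kind is not None:
            counts[kind] += 1
    return counts


def shares(counts, kinds):
    """Percentage share of each structure type within the given group."""
    total = sum(counts[k] for k in kinds)
    return {k: (100 * counts[k] / total if total else 0.0) for k in kinds}


# Toy input (not corpus data): one Pre-P, three Cau-C, one Con-C marker.
toy_tokens = "gen1ju4 xing2fa3 yin1ci3 ru2guo3 gu4 yin1wei2".split()
toy_counts = count_markers(toy_tokens)
print(shares(toy_counts, ["Cau-C", "Con-C"]))  # {'Cau-C': 75.0, 'Con-C': 25.0}
```

Shares are computed within a group (phrases, or clauses) rather than over the whole text, which mirrors how the percentages for Pre-P versus Ver-P and for Cau-C versus Con-C are reported.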

Cognitive dimension

From a cognitive perspective, CDA aims not only to describe the structural properties of text and talk, but also to account for how the cognitive, social, and political contexts influence the structures, strategies, and functions of text or talk (van Dijk, 2016). In the study of how language expresses a speaker’s attitude, Chafe (1986) holds that there are four cognitive modes (belief, induction, rumor, and deduction), corresponding to four knowledge sources: the fuzzy or unknowable, evidence, language, and hypothesis. However, Hu (1994) believes that any individual, institutional, or social–cultural experience is eventually stored in the culture and that all experiences can be reproduced, so the knowledge sources are revised into culture, sense, language, and hypothesis in Chinese. Therefore, the expressions indicating knowledge sources as culture, sense, rumor, and hypothesis are retrieved and listed by Wmatrix3.0 in CCJ. The cognitive sources of Chinese court judgments are then classified into four types: faith (Fai), induction (Ind), paraphrase (Par), and inference (Inf).

Firstly, as individual and social–cultural experience is stored in and can be reproduced in a culture (Hu, 1994), Fai refers to the common sense of the public and the ordinary cognition of a particular speech community. The knowledge sources of Fai in court judgments are mainly laws, regulations (Reg), and evidence (Evi) (Wu and Cheng, 2020), which provide the social context for the trial result. In SCA, Fai is a part of social cognition, which includes socially shared emotions, attitudes, ideology, and memory structure (van Dijk, 2003). Law is the enforceable body of rules that govern society (Martin, 2009). Regulations are formulated and issued to govern the behavior of social organizations in the Administrative Laws and Regulations of China. In Extracts 14 and 15, the basis of the court decision is derived from “fan3bu2zheng4dang1 jing4zheng1fa3 (Anti-Unfair Competition law)” and “cheng2zhen4 qi3ye4 zhi2gong1 yang3lao3bao3xian3 zhuan3yi2 zan4xing2ban4fa3 (Interim Measures of Basic Retirement Security for Employees of Urban Enterprises)”, which mark the cognitive source of faith. Evidence proves the existence or non-existence of some fact (Martin, 2009), and “ji1dong4che1 jiao1tong1shi4gu4 qiang2zhi4bao3xian3 (the traffic insurance policy of the motor vehicle)” in Extract 16 is documentary evidence that is confirmed and adopted by the court, also indicating the cognitive source of faith. Owing to the many citations of Law, Reg, and Evi, Pre-P and Ver-P have significant positive effects on Fai in Model (15). In addition, articles of law account for 4.81% of the total words in CCJ, regulations for 1.6%, and evidence for 15.68%.

Secondly, induction is the cognitive process of reasoning, especially when a person makes further rational judgments through the direct perception of the sensory organs (Hu, 1995). Because Fps and H-ad positively affect Ind in Model (16), the cognitive source of induction in CCJ is mainly manifested as visual and auditory perception from the first-person perspective, and direct perception through the sensory organs is given high confidence with H-ad. Thus, the sensory verb and the first-person subject are the primary markers. In Extract 17, combined with the first-person subject “wo3 (I)”, the auditory verb “ting1dao4 (heard)” and the visual verb “kan4dao4 (saw)” mark the cognitive source of induction.

Thirdly, paraphrase in court judgments means that the speaker conveys information not from his or her own experience but from another person’s oral or written retelling (Wu and Cheng, 2020). Fairclough (1995) divides paraphrases into four types: direct paraphrase, free direct paraphrase, indirect paraphrase, and paraphrase without marks. Since Tps and Ups have significant positive effects on Par in Model (14), the paper identifies four sentence structures in CCJ: (1) information subject + verbal verb + direct paraphrase, where the direct paraphrase preserves the original statement and tone (Leech and Short, 1981); (2) information subject + verbal verb + indirect paraphrase, where the indirect paraphrase expresses the information from the perspective of an interpreter in the current context (Leech and Short, 1981); (3) speaker + information subject + verbal verb + direct paraphrase; and (4) speaker + information subject + verbal verb + indirect paraphrase. Structure (4) is illustrated by “wo3ting1shuo1 (I heard from… that)” in Extract 18, where “a fellow villager” is the information subject, marking the cognitive source of paraphrase. Moreover, M-ad and L-ad have significant positive effects on Par in Model (17), which shows the speaker’s low confidence in paraphrase.

Fourthly, inference is neither personal experience nor hearsay but is based on reasoning or hypothesis from the known information (De Haan, 2001). Con-C and Cau-C have significant positive effects on Inf in Model (18), so there are mainly causal inferences and conditional inferences in CCJ. Causal inference is widely used in Chinese court judgments, mainly supporting the verdict conclusion and case facts. In Extract 19, “enjoy the joint rights” shows the causal reasoning about the case facts, and “ju4ci3 (therefore)” indicates the cognitive source of inference. By contrast, conditional inference is rarely used in CCJ and concerns reasoning about the judicial procedure and case facts. For instance, the conditional clause “(ru2guo3) If you refuse to accept this judgment” in Extract 20 presents a hypothetical possibility about the judicial procedure.

Extract 14 Item 3 in Article 5 of the PRC Anti-Unfair Competition Law (fan3bu2zheng4dang1 jing4zheng1fa3) stipulates that an operator shall not use another enterprise’s name without authorization, which may mislead the consumers for that enterprise’s commodity.

Extract 15 According to Article 3, Article 11, and Article 12 in Interim Measures of Basic Retirement Security for Employees of Urban Enterprises (cheng2zhen4 qi3ye4 zhi2gong1 yang3lao3bao3xian3 zhuan3yi2 zan4xing2ban4fa3), which is jointly issued by Ministry of Human Resources and Social Security and Ministry of Finance, if the employee gets a job in different provinces, his or her insurance should transfer to the latest area.

Extract 16 The plaintiff accordingly provides evidence 1, that is, the traffic insurance policy of the motor vehicle (ji1dong4che1 jiao1tong1 shi4gu4 qiang2zhi4bao3xian3), in order to prove that the accident vehicle, semi-trailer tractor (plate number: Min D87263) driven by the defendant Yao, has insured the traffic insurance from the defendant Xiamen branch of Pingan Insurance.

Extract 17 On the second floor of my bathroom, I heard (wo3ting1dao4) a woman shout for help outside the window. I ran to the lawn outside the wall and saw (kan4dao4) a woman naked, crying in the corner.

Extract 18 I heard from (wo3ting1shuo1) a fellow villager that he lost 80,000 RMB in the supermarket casino.

Extract 19 On April 17, 2013, Niu signed a contract with Birdman Company, which agreed that Birdman Company would exclusively invest in the production and promotion of Niu’s music albums. Therefore (ju4ci3), Niu and Birdman Company enjoy the joint rights of Niu’s performing activities in the market.

Extract 20 If (ru2guo3) you refuse to accept this judgment, you may submit an appeal to Beijing Intermediate People’s Court, and pay the court acceptance fee within 15 days from the date of the judgment. If (ru2guo3) you refuse to pay the fee within the time limit or are overdue, the appeal will be dismissed.
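
The way surface markers in Extracts 10 to 20 point to the four cognitive sources can be sketched as a small rule-based tagger. This is a hedged illustration rather than the study's labeling procedure, which combined software retrieval with manual labeling: the marker lists contain only forms quoted in this section, the function name `tag_sources` is hypothetical, and simple substring matching is assumed to be adequate for the toy inputs shown.

```python
# Marker lists restricted to forms quoted in this section (an assumption,
# not the full inventory used for manual labeling in the study).
FAI_MARKERS = ("gen1ju4", "an4zhao4")       # citing law, regulation, evidence
SENSORY_VERBS = ("ting1dao4", "kan4dao4")   # heard, saw
PAR_MARKERS = ("ting1shuo1",)               # heard from someone that ...
INF_MARKERS = ("yin1ci3", "ju4ci3", "ru2guo3")  # hence, therefore, if
FIRST_PERSON = "wo3"


def tag_sources(sentence):
    """Return the cognitive sources whose markers occur in the sentence."""
    s = sentence.lower()
    tags = []
    if any(m in s for m in FAI_MARKERS):
        tags.append("Fai")
    # Induction needs both a first-person subject and a sensory verb.
    if FIRST_PERSON in s and any(v in s for v in SENSORY_VERBS):
        tags.append("Ind")
    if any(m in s for m in PAR_MARKERS):
        tags.append("Par")
    if any(m in s for m in INF_MARKERS):
        tags.append("Inf")
    return tags


# Toy inputs that mix the pinyin markers with English glosses.
print(tag_sources("gen1ju4 Article 239 of the Criminal Law"))        # ['Fai']
print(tag_sources("wo3ting1dao4 a woman shout, kan4dao4 a woman"))   # ['Ind']
print(tag_sources("wo3ting1shuo1 a fellow villager lost money"))     # ['Par']
print(tag_sources("ru2guo3 you refuse to accept this judgment"))     # ['Inf']
```

A sentence can receive more than one tag, which fits the observation that a single stretch of a judgment may cite a provision and draw an inference at once; resolving such overlaps would still require manual judgment.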

In SCA, social cognition is the mental structure used in the social context, which is shared by all members of social groups and cultures (van Dijk, 2003). The cognitive source of faith in CCJ mainly provides the legal provisions and judicial interpretation as the social background of the judicial trial, accounting for 22.12% of the total discourse. Personal cognition, by contrast, refers to how an individual member of a social group subjectively produces and understands discourse, and it is manifested as unique psychological representations of specific situations, events, actions, and characters (van Dijk and Kintsch, 1983). The cognitive sources of induction and paraphrase are typical personal cognition, through which the opposing sides prove and debate testimony, documentary evidence, or hearsay evidence. In CCJ, the discourse indicating induction and paraphrase accounts for 1.59% and 10.12%, respectively. The cognitive source of inference not only refers to the reasoning process in the debate but also gives the judge confident support for reaching a reasonable trial result, accounting for 11.71% in CCJ. Therefore, when personal cognition conforms to social consensus, it can become social cognition in a court judgment. The study holds that social cognition connects individual ideas and social group attitudes, and that the production of legal discourse is essentially the creation of social cognition.
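
The percentages reported in this subsection can be combined with simple arithmetic. The sketch below only re-adds figures stated in the text; the variable names are hypothetical, and it assumes the figures share a comparable basis, although the text describes some as shares of "total words" and others as shares of "total discourse".

```python
# Percentages as reported in this subsection.
FAI_PARTS = {"Law": 4.81, "Reg": 1.60, "Evi": 15.68}
SOURCE_SHARE = {"Fai": 22.12, "Ind": 1.59, "Par": 10.12, "Inf": 11.71}

# Sum of the three components of faith, rounded to two decimals.
fai_from_parts = round(sum(FAI_PARTS.values()), 2)

# Personal cognition: induction plus paraphrase.
personal = round(SOURCE_SHARE["Ind"] + SOURCE_SHARE["Par"], 2)

# Share of CCJ that carries any of the four cognitive-source markings.
marked_total = round(sum(SOURCE_SHARE.values()), 2)

print(fai_from_parts)  # 22.09
print(personal)        # 11.71
print(marked_total)    # 45.54
```

The component sum for faith (22.09%) is close to the reported 22.12%; the small gap may reflect rounding or a different counting basis, which the text does not specify.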

Social dimension

van Dijk’s social dimension is a context of discourse communication, which is the container of objective reality referred to by discourse (Cheng and Wu, 2019). The social dimension involves social and group relationships, especially discourse-related power relations. SCA defines power as manipulation (van Dijk, 2014a), and the representation of manipulation is typical in legal discourse (Cheng and Cheng, 2012). Court judgments, demonstrating power and control (Wagner and Cheng, 2011), apply the rules of law to the case facts in dispute and deliver judicial interpretations and decisions to the parties. Thus, the present study holds that court judgments achieve multiple social functions by manipulating the discourse components and cognitive sources.

Firstly, citation plays an essential social function in CCJ: since personal cognition is limited, it is vital to support judicial ideas with other information, which may come from social consensus. The regression analysis confirms that the cognitive source of Fai has significant positive effects on Cit in Model (1) in CCJ. In further analysis, Law, Reg, and Evi in Fai have significant positive effects on Cit in Models (5)–(7), and the discourse components Cau-C and Pre-P are primarily used in the citation of Law and Evi, respectively, in Model (1).

Secondly, Ind and Par have significant positive effects on Dep in Model (2), because information from a speaker’s sensory experience (Wu and Cheng, 2020) and from another person’s utterance or writing can both perform the social function of depiction. Model (12) shows that Fps and H-ad often describe the direct perception of the speaker’s sensory organs with high confidence, and that Ups is the typical structure of Par.

Thirdly, verbal paraphrase means that the speaker marks the verbal behavior of another in the discourse (Thompson, 1996). In other words, the information provided by the speaker comes from another voice in a court judgment. By verbal paraphrasing, the speaker expresses the distance between himself or herself and the quoted voices (Luo, 2013), in order to avoid responsibility for the authenticity of the information in a court judgment, and thereby achieves the social function of distance. Besides, the quoted information in Cit is mainly the objective cultural truth, but that in Dis comes from the speaker’s sensory cognition. Models (3) and (13) show that the cognitive source of Par has significant positive effects on Dis, and that Tps and Con-C also have significant positive effects on Dis in CCJ.

Fourthly, logical and reasonable judicial decisions are derived from one or several known facts or assumptions (Wu and Cheng, 2020) in court judgments, and thus the result of the judgment is a summary of the social behavior. The data show that Inf has a significant positive effect on Sum in Model (4). There are mainly conditional inferences and causal inferences in CCJ. The former starts from a hypothetical statement and then draws the conclusion from the premise, and the latter summarizes general conclusions from a series of specific facts (Zhang and Yu, 2003); thus, the social function of summary gives a general explanation of the reasoning mentioned above. Corresponding to the reasoning process, Con-C and Cau-C significantly positively affect Sum. At the same time, Pre-P and Ver-P are often used in the reasoning process and also significantly positively affect Sum.

At the micro level, the social dimension of SCA is realized in the discourse behavior of text and conversation, which is caused by the social group relations and their common cognition at the macro level (van Dijk, 2009b). Therefore, as a particular type of judicial discourse for legal professionals (Cheng and He, 2016), the court judgment has social functions that correspond to its cognitive sources and that build the surface structure with various discourse components. In CCJ, faith with the citation function provides a legal basis for judgment analysis, and induction with the depiction function offers information from direct experience as evidence. Paraphrase with the distance function relays others’ information to reduce the speaker’s legal responsibility for the authenticity of that information. Inference with the summary function, in turn, explains the reasoning process behind the court’s resolution.
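
The correspondences summarized above, between each cognitive source, its social function, and the discourse components reported as significant markers, can be laid out as a small lookup structure. This is a sketch for orientation only: the class and function names are hypothetical, and the component lists simply restate the effects reported for Models (14) to (18).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceProfile:
    source: str        # cognitive source
    function: str      # corresponding social function
    components: tuple  # discourse components reported as positive predictors


# Restates the pairings reported in the text; nothing here is newly estimated.
PROFILES = {
    "Fai": SourceProfile("faith", "citation (Cit)", ("Pre-P", "Ver-P")),
    "Ind": SourceProfile("induction", "depiction (Dep)", ("Fps", "H-ad")),
    "Par": SourceProfile("paraphrase", "distance (Dis)",
                         ("Tps", "Ups", "M-ad", "L-ad")),
    "Inf": SourceProfile("inference", "summary (Sum)",
                         ("Con-C", "Cau-C", "Pre-P", "Ver-P")),
}


def sources_marked_by(component):
    """List the cognitive sources for which a component is a reported marker."""
    return sorted(k for k, p in PROFILES.items() if component in p.components)


print(PROFILES["Par"].function)    # distance (Dis)
print(sources_marked_by("Pre-P"))  # ['Fai', 'Inf']
print(sources_marked_by("Fps"))    # ['Ind']
```

The lookup makes one feature of the findings visible: a component such as Pre-P is not tied to a single source, so a marker alone does not determine the cognitive source without its context.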

Conclusion

SCA goes deep into the primary factors of language and society and plays a critical role in description and interpretation (Wu and Sun, 2019). From this perspective, this study describes and explores Chinese court judgments with a corpus method, concentrating on the interactions among the discourse, cognitive, and social dimensions, which are deemed the essential questions of CDA. This study finds that: (1) as a multi-dimensional social phenomenon, discourse needs to be explained reasonably in society, while human cognition is the intermediary factor (van Dijk, 2014b), and thus the discourse components can mark the cognitive source in a court judgment. These components include verb structures such as Fps+V, Tps+V, and Ups+V, and adverbs such as H-ad, M-ad, and L-ad. Moreover, phrases such as Pre-P/Ver-P and clauses such as Cau-C/Con-C are typical structures that indicate the cognitive sources in CCJ. The relationship between discourse and cognition is the mapping relationship between language and psychology. (2) The cognitive source of Fai is a part of social cognition based on Law and Reg, and those of Ind and Par provide personal cognition of testimony, documentary evidence, or hearsay evidence. In a court judgment, when personal cognition conforms to a social consensus (Xin and Liu, 2017), Inf can transfer the individual cognition into a part of the social cognition in the reasoning process. (3) Because of the opaque language and self-referential nature of legal discourse (Cheng and Machin, 2022), the social functions of court judgments correspond to their cognitive sources and build the surface structure with various discourse components.

The present study investigates the relationship between the discourse, cognitive, and social dimensions in Chinese court judgments. However, further studies of the cognitive dimension are of critical value. In the analysis of personal cognition, different audiences, such as the parties to the case, legal practitioners, and the public, are not considered, yet audiences’ understanding and acceptance are pivotal factors for court judgments. In other words, intersemiotic operations are the core of interpreting court judgments (Cheng and Cheng, 2014). Moreover, most present studies on court judgments adopt a self-built corpus with insufficient data, which affects the validity of the results to a certain extent. Therefore, it is necessary to build a large shared corpus in the future to examine the intersemiotic operations in court judgments comprehensively and thoroughly.