The detail page

The detail page records the lexical and distributional details of a lexical unit (LU) contained in the DCS database. Detail pages are the central link between the query/dictionary pages and the analyzed texts that constitute the corpus (see the technical details).

Attention: Please do not link to the detail pages from other websites. These pages are generated dynamically from the DCS database, and internal parameters may change in future versions of DCS (see the page on technical details and the page on how to cite DCS).

Each detail page is structured in the same way: it lists the occurrences of the LU in alphabetical and in chronological order, part-of-speech information (case, number and gender counts for nouns and adjectives, verbal forms for verbs), and the meanings.
For more information about the KWIC link, refer to the help page.

Occurrences in alphabetical order

The first type of information that is available for each LU is an alphabetical list of texts in which the LU is found. It consists of the following items:

The name of the text is linked to a new page, on which the occurrences of the LU in this text are displayed in detail.
A pair of numbers n₁/n₂ after the name of the text records the absolute number of occurrences of the LU in the text (n₁) and the total number of LUs in this text (n₂).
Example: Bodhicaryāvatāra 5/13608 means that the LU occurs 5 times in the Bodhicaryāvatāra and that the Bodhicaryāvatāra contains 13608 LUs.
The next two columns display this information in the form of two diagrams.
In the first diagram, the absolute frequency of the LU in the current text is compared with the absolute frequencies of the LU in the remaining texts. The widest block indicates the text that contains the most occurrences of the LU, the narrowest one the text that contains the fewest.
The second diagram represents the same kind of information, but for relative instead of absolute frequencies. Because relative frequencies take the length of each text into account, the second diagram may describe the distribution of the LU more realistically than the first.
In some cases, a percentage in the form p ± x is printed in the last column of the text list. This number gives the percentage of the LU in the current text (p=n₁/n₂; see above) and an estimate of the interval in which the "true" percentage may be found (x; α=5%).³
Example: A value of 0.03298% ± 0.01454 for the LU śiva in the Bhāgavatapurāṇa means that the "true" percentage of śiva in the Bhāgavatapurāṇa lies between 0.03298%-0.01454%=0.01844% and 0.03298%+0.01454%=0.04752%. This information can, for example, be used to compare the percentages of an LU in different texts in a statistically reliable way.
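The arithmetic behind the n₁/n₂ and p ± x columns can be sketched in a few lines of Python. The interval estimator that DCS actually uses is defined in footnote 3 and is not reproduced here. The sketch assumes a simple normal-approximation (Wald) interval with z = 1.96 for α = 5%, so its half-width need not match the values printed by DCS. All function names are hypothetical.

```python
import math
from typing import Tuple


def percentage(n1: int, n2: int) -> float:
    """Share of the LU among all LUs of a text, in percent (p = n1/n2)."""
    return 100.0 * n1 / n2


def wald_half_width(n1: int, n2: int, z: float = 1.96) -> float:
    """Half-width x of a normal-approximation interval, in percentage points.

    Assumption: a Wald interval; DCS may use a different estimator.
    """
    p = n1 / n2
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n2)


def interval(p: float, x: float) -> Tuple[float, float]:
    """Lower and upper bound of the interval p ± x."""
    return p - x, p + x


# Bodhicaryāvatāra 5/13608 (example from the text list above)
p = percentage(5, 13608)
x = wald_half_width(5, 13608)
print(f"{p:.5f}% ± {x:.5f}")

# śiva in the Bhāgavatapurāṇa, using the values printed by DCS
low, high = interval(0.03298, 0.01454)
print(f"{low:.5f}% to {high:.5f}%")
```

When the intervals of two texts do not overlap, the two percentages can reasonably be treated as different at the chosen level; overlapping intervals do not by themselves show that the percentages are equal.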

Occurrences in chronological order

In the next section of the detail page, all occurrences of the LU and the corresponding information (see above) are grouped into five time slots.¹

The numbers of occurrences are grouped by time slot to create the source data for a statistical evaluation. A table gives the observed and expected values that are used to perform a Χ² test.² Percentages for the time slots are given according to the principles described above. The Χ² test is only performed if at least two time slots are filled and none of the expected values equals zero when rounded. If at least one of the observed cell values is below 5, a warning is issued. The resulting Χ² value indicates whether the LU is evenly distributed over the time slots. If the Χ² value is well above the critical value that is printed below the details of the calculation, it should be assumed that the LU is not evenly distributed. In this case, there is at least one time slot whose observed value differs significantly from the value that is expected according to Fn. 2. Of course, such a result does not prove that time influences the distribution of the LU, because the influences of style, literary type etc. would first have to be excluded. However, it is a first indication that a chronological influence may exist.
The diagram below the Χ² test displays the differences between the observed and the expected values graphically. It can be used to detect temporal trends in the distribution of an LU.
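The conditions for running the test and the observed-minus-expected differences shown in the diagram can be sketched as follows. This is an illustration, not the DCS implementation. It assumes that the expected values follow from the marginal totals, and that a time slot counts as "filled" when the LU occurs in it at least once. The function names are hypothetical. The counts are those of śiva in classical, medieval and late literature from the example further down this page.

```python
from typing import Dict, List, Tuple

Slots = Dict[str, Tuple[int, int]]  # slot name -> (occurrences of the LU, total LUs in slot)


def expected_counts(slots: Slots) -> Dict[str, float]:
    """Expected LU occurrences per slot if the LU were spread evenly."""
    lu_total = sum(n for n, _ in slots.values())
    grand_total = sum(t for _, t in slots.values())
    return {name: lu_total * t / grand_total for name, (_, t) in slots.items()}


def check_test_conditions(slots: Slots) -> Tuple[bool, List[str]]:
    """Return (test can be performed, warnings about small observed cells)."""
    if sum(t for _, t in slots.values()) == 0:
        return False, []
    expected = expected_counts(slots)
    # Assumption: "filled" means the LU occurs at least once in the slot.
    filled = [name for name, (n, _) in slots.items() if n > 0]
    runnable = len(filled) >= 2 and all(round(e) != 0 for e in expected.values())
    # Both observed cells of a slot are checked: LU count and remaining LUs.
    warnings = [
        f"observed value below 5 in slot '{name}'"
        for name, (n, t) in slots.items()
        if min(n, t - n) < 5
    ]
    return runnable, warnings


def differences(slots: Slots) -> Dict[str, float]:
    """Observed minus expected LU occurrences, as plotted in the diagram."""
    expected = expected_counts(slots)
    return {name: n - expected[name] for name, (n, _) in slots.items()}


slots = {
    "classical": (810, 643343),
    "medieval": (476, 379908),
    "late": (402, 287149),
}
runnable, warnings = check_test_conditions(slots)
print(runnable, warnings)
for name, diff in differences(slots).items():
    print(f"{name}: {diff:+.1f}")
```

A positive difference means that the LU is more frequent in the slot than an even distribution would predict, a negative difference that it is rarer.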

An example may help to clarify the use of these statistical methods. On the query or dictionary page, select the masculine noun śiva and navigate to the chronological details on the detail page. The Χ² test yields a value of 1018.644 (January 2010; may change in later versions), which is clearly above the critical value of 7.779. Therefore, śiva does not seem to be evenly distributed over the time slots. The diagram that is displayed below the Χ² test hints at a possible temporal trend of this LU. The first time slot may be neglected for this evaluation because śiva is very rare in this slot. In epic literature, however, śiva is found much less frequently than expected, while from classical times onwards its frequency is above the expected number. This may indicate that the frequency of this LU increases over time. To test this hypothesis, we exclude the early and the epic occurrences and repeat the Χ² test with classical, medieval and late literature:

time slot	śiva: observed (expected)	other LUs: observed (expected)	total
classical	810 (829)	642533 (642514)	643343
medieval	476 (489)	379432 (379419)	379908
late	402 (370)	286747 (286779)	287149
total	1688	1308712	1310400

This test yields a Χ² value of 3.5806, which is below the critical value of 4.605. Therefore, śiva seems to be evenly distributed in (post-)classical literature.
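The Χ² value for the three (post-)classical time slots can be recomputed from the observed counts alone. The sketch below assumes the usual Pearson statistic with expected values of row total × column total / grand total, which is consistent with the expected values printed in the table. The function name is hypothetical. The critical value of 4.605 is the one printed on the page; it equals the 90% quantile of the Χ² distribution with (3-1)·(2-1) = 2 degrees of freedom.

```python
from typing import List


def chi_square(table: List[List[int]]) -> float:
    """Pearson chi-square statistic of a contingency table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under an even distribution of the LU.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Rows: classical, medieval, late. Columns: śiva, all other LUs.
table = [
    [810, 642533],
    [476, 379432],
    [402, 286747],
]
CRITICAL_VALUE = 4.605  # as printed on the detail page

value = chi_square(table)
verdict = "below" if value < CRITICAL_VALUE else "above"
print(f"chi-square = {value:.4f} ({verdict} the critical value)")
```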

In the last step of this example, we compare the distribution in early and epic literature with the distribution in (post-)classical literature. For this purpose, we add up the values and create a 2×2 table:

time slots	śiva	other LUs	total
early, epic	6+40=46	874421	874467
classical-late	1688	1308712	1310400
total	1734	2183133	2184867

The Χ² value is 1009.638 (critical value: 2.706). Therefore, there seems to be a rather clear break in the frequency distribution of the LU śiva, which may, for example, be explained by the development of religious ideas in the centuries of the Common Era. Again, we have to emphasize that such conclusions are only valid if other influences, and especially biases in the corpus, can be excluded!
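For a 2×2 table, the Pearson statistic can also be written in closed form: with cells a, b in the first row and c, d in the second, Χ² = N·(ad-bc)² / ((a+b)(c+d)(a+c)(b+d)), where N is the grand total. The sketch below applies this shortcut to the table above. It assumes that no continuity correction is applied, and the function name is hypothetical.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square of the 2x2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: early + epic, classical to late. Columns: śiva, all other LUs.
value = chi_square_2x2(46, 874421, 1688, 1308712)
CRITICAL_VALUE = 2.706  # as printed on the detail page
print(f"chi-square = {value:.3f}, critical value = {CRITICAL_VALUE}")
```

The critical value of 2.706 corresponds to the 90% quantile of the Χ² distribution with one degree of freedom.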

Part of speech information

Nouns and adjectives

Each noun or adjective is accompanied by detailed information about how often each of its cases, numbers and genders occurs in the database. This information may, for instance, be useful for detailed studies of the lexico-syntactic features of Sanskrit words. The phrases containing the forms can be retrieved by clicking on the linked part of speech information.

Verbal forms

If the LU is a verb, a list of the finite and non-finite verbal forms that are attested in the text collection of the DCS is printed below the statistical evaluation. Each form is linked to a context page that shows the phrases containing it.

Meanings

At the end of the detail page, you will find the meanings of the current LU in alphabetical order. An abbreviation enclosed in round brackets indicates the source of each meaning. In most cases, the source is MW, the digital version of Monier-Williams. Citations of other books are more detailed and are frequently accompanied by a page number. Note that these values are only abbreviations of more detailed information contained in the SanskritTagger database.