Introduction
  • What is the PolyU Language Bank?
  • What is in the Bank?
  • Why was the Bank created?
  • Uses of the Bank
  • How to use the Bank

    Selecting a corpus

    • Browse the contents (by discipline)

    Click on a discipline label from the pull-down menu to select a discipline of your choice. A list of the relevant corpora will appear on the main screen. Scroll down or click on a corpus name to go to the corpus description; click on a search corpus icon to search that corpus.

    The discipline types include: General, Academic, Business, Journalistic, Legal / Documentary and Literature. A corpus can fall into more than one of the above disciplines and thus be placed under more than one category. Corpora which only partially represent a particular discipline are also included under that discipline type.

    • Browse the entire collection (Index of Corpora)

    Click on 'Index of Corpora'. A table of all the corpora in the Bank will appear on the main screen. Each corpus has been assigned one or more descriptor values under each of four feature headings: Language, Discipline, Corpus type and Medium. Read across the rows to view the descriptor values beside the name of each corpus. Click on any corpus name / collection name in the first two columns of the table to go to the corpus description, then click on a search corpus icon to search that corpus.

    E.g. The sub-corpus of Business Letters (from the Corpus of Business Correspondence) is a collection of English texts from the Business discipline; it comprises both native speaker and learner data that were originally produced as written texts.

    • Selecting by language and other features (Search Corpora)

    Search for a corpus that contains your preferred combination of features using the four pull-down menus under the categories of Language, Discipline, Corpus type and Medium. You must specify a language in the first pull-down menu, and may additionally select a value under one or more of the other categories. A list of the corpora matching your chosen criteria will appear below the pull-down menus; if no corpus matches, 'No match results' is shown instead. Click on a corpus name to go to the corpus description; click on a search corpus icon to search that corpus.
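
    The selection rule above (language required, other features optional) can be sketched in a few lines of Python. This is only an illustration, not the Bank's actual implementation: the function name search_corpora, the catalog layout and the descriptor labels are assumptions, and the second catalog entry is invented purely for the example.

```python
# Hypothetical catalog: each feature holds a set because a corpus can carry
# more than one descriptor value. The labels are illustrative guesses.
CATALOG = [
    {"name": "Business Letters", "language": {"English"},
     "discipline": {"Business"}, "corpus_type": {"Native speaker", "Learner"},
     "medium": {"Written"}},
    {"name": "Sample Legal Corpus", "language": {"English", "Chinese"},
     "discipline": {"Legal / Documentary"}, "corpus_type": {"Native speaker"},
     "medium": {"Written"}},
]

def search_corpora(catalog, language, discipline=None, corpus_type=None, medium=None):
    """Return the names of corpora matching the language and any other chosen features."""
    if not language:
        raise ValueError("a language must be specified")
    wanted = {"language": language, "discipline": discipline,
              "corpus_type": corpus_type, "medium": medium}
    # A feature left as None places no restriction on the result.
    return [corpus["name"] for corpus in catalog
            if all(value is None or value in corpus[feature]
                   for feature, value in wanted.items())]

matches = search_corpora(CATALOG, "English", discipline="Business")
print(matches if matches else "No match results")
```

    An empty result list corresponds to the 'No match results' message.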

    Conducting a search using the PolyU Language Bank Concordancer

    At a minimum, enter the string that you wish to search for in your selected corpus, then scroll down and click on the Search for concordances button. Concordance lines will be displayed for items that match your search. All other features are optional (see below).

    • Search string

    Enter a search string (part of a word, or one or more words) and select the type of search required:

    equal to: Finds exact matches only. E.g., 'in' finds 'in'.
    starts with: Finds those items that start with the search string. E.g., 'in' finds 'inaccurate, include, interpret...'
    ends with: Finds those items that end with the search string. E.g., 'in' finds 'in, again, Britain...'
    contains: Finds all instances containing the search string. E.g., 'in' finds 'again, meeting, original...'
    • Using the % sign

    The % sign stands for any single letter, so each % in the search string matches exactly one letter. E.g.,
    A search for equal to 'b%tt%n' will find 'button, bitten, batten'.
    A search for equal to '%%tten' will find 'gotten, bitten, kitten...'
    A search for contains 'b%in' will find 'being, bring, rubbing, doubting...'
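
    The four search types and the % sign can be modelled with regular expressions. The sketch below is a guess at the behaviour described above, not the concordancer's own code: the names build_pattern and matches are hypothetical, matching is assumed to be case-insensitive, and % is assumed to match any single word character.

```python
import re

def build_pattern(search, mode):
    """Translate a search string and a search type into a compiled regular expression."""
    # Assumption: each % matches exactly one word character; all else is literal.
    core = "".join(r"\w" if ch == "%" else re.escape(ch) for ch in search)
    shapes = {
        "equal to": core,
        "starts with": core + r"\w*",
        "ends with": r"\w*" + core,
        "contains": r"\w*" + core + r"\w*",
    }
    # Assumption: the search ignores case.
    return re.compile(shapes[mode], re.IGNORECASE)

def matches(word, search, mode):
    """Return True if the whole word satisfies the search under the given type."""
    return build_pattern(search, mode).fullmatch(word) is not None

words = ["button", "bitten", "batten", "kitten", "being", "rubbing"]
print([w for w in words if matches(w, "b%tt%n", "equal to")])
print([w for w in words if matches(w, "b%in", "contains")])
```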

    • Associated word / Find / Range from keyword (non-Asian corpora only)

    If an item is entered in the associated word category, the concordancer will search only for those occurrences of the search string (keyword) which have the specified associated word(s) in their neighbourhood. Use find to specify the environment criterion: anywhere in the string, left of the keyword only or right of the keyword only. Use range from keyword to specify the proximity of the two words in terms of number of characters. By default there is no limit on distance; it is bounded only by the line width.
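
    As a rough illustration of the associated word filter, the sketch below keeps only those keyword occurrences that have the associated word on the requested side and within the requested number of characters. It makes several assumptions: the function name find_with_associate is hypothetical, the input is a single line of text, distance is counted as the number of characters between the two words, and max_chars=None stands for the default of no limit.

```python
import re

def find_with_associate(line, keyword, associate, side="anywhere", max_chars=None):
    """Return the start offsets of keyword occurrences that have the associated word nearby."""
    def whole_word(word):
        return re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)

    associate_spans = [m.span() for m in whole_word(associate).finditer(line)]
    hits = []
    for key in whole_word(keyword).finditer(line):
        for start, end in associate_spans:
            if end <= key.start():        # associated word lies to the left
                where, gap = "left", key.start() - end
            elif start >= key.end():      # associated word lies to the right
                where, gap = "right", start - key.end()
            else:                         # overlapping spans are ignored
                continue
            if side in ("anywhere", where) and (max_chars is None or gap <= max_chars):
                hits.append(key.start())
                break
    return hits

line = "the bank raised interest rates and the bank lowered fees"
print(find_with_associate(line, "bank", "interest", side="right"))
```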

    • Format

    The keyword is normally printed but can be gapped if this option is selected. The keyword is then replaced by '(____)' in each case. Use this option to create classroom concordancing exercises / cloze tests.
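
    The gapped format amounts to a simple substitution. A minimal sketch, assuming whole-word, case-insensitive matching and a hypothetical helper name gap_keyword; unlike the concordancer, which gaps the keyword of each concordance line, this version blanks every whole-word occurrence in the line it is given:

```python
import re

def gap_keyword(line, keyword, blank="(____)"):
    """Replace each whole-word occurrence of the keyword with a blank."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.sub(pattern, blank, line, flags=re.IGNORECASE)

print(gap_keyword("She is interested in art.", "in"))
```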

    • Numbering

    Select Yes to have the concordances numbered.

    • Sort type

    Leave the concordances unsorted, or sort them alphabetically by the collocate (the word in the environment) to the left or right of the search item.

    • Collocate distance from keyword (non-Asian corpora only)

    Select the collocate distance from the keyword. Concordances may be sorted according to the collocate (word) positioned one, two or three words to the left or right of the keyword.
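
    Sorting by collocate can be pictured as follows. This is a sketch under stated assumptions rather than the concordancer's own logic: each concordance is represented as a hypothetical (tokens, keyword_index) pair, comparison ignores case, and a line with no word at the requested position sorts first.

```python
def collocate_at(tokens, keyword_index, side, distance):
    """Return the lower-cased word `distance` places left or right of the keyword, or ''."""
    position = keyword_index + (distance if side == "right" else -distance)
    return tokens[position].lower() if 0 <= position < len(tokens) else ""

def sort_concordances(hits, side="right", distance=1):
    """Sort (tokens, keyword_index) pairs alphabetically by the chosen collocate."""
    return sorted(hits, key=lambda hit: collocate_at(hit[0], hit[1], side, distance))

hits = [
    ("a strong cup of tea".split(), 2),
    ("my cup is empty".split(), 1),
    ("the cup and saucer".split(), 1),
]
for tokens, _ in sort_concordances(hits, side="right", distance=1):
    print(" ".join(tokens))
```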

    • Print collocates table

    For sorted concordances only, you can view a frequency table listing every word that occurs to the left or right of the keyword at the specified collocate distance, together with the number of times it co-occurs with the keyword (non-Asian corpora only). Choose to have the words presented alphabetically or in order of frequency.
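
    A collocates table of this kind is essentially a frequency count over one collocate position. The following sketch assumes the same hypothetical (tokens, keyword_index) representation of a concordance, case-insensitive counting, and alphabetical tie-breaking in the frequency ordering; the name collocates_table is invented for the example.

```python
from collections import Counter

def collocates_table(hits, side="left", distance=1, order="frequency"):
    """Count the words found at one collocate position across all concordances."""
    offset = distance if side == "right" else -distance
    counts = Counter()
    for tokens, keyword_index in hits:
        position = keyword_index + offset
        if 0 <= position < len(tokens):
            counts[tokens[position].lower()] += 1
    if order == "alphabetical":
        return sorted(counts.items())
    # Most frequent first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

tea_hits = [(line.split(), 1) for line in
            ["strong tea", "Strong tea", "green tea", "hot tea"]]
for word, count in collocates_table(tea_hits, side="left", distance=1):
    print(word, count)
```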

    • Line width / Stop after

    Vary the line width and the number of concordances returned for a particular search by changing the values in these categories.
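
    The effect of these two settings can be seen in a small keyword-in-context sketch. It is an illustration only: the function name concordance is hypothetical, the line width is assumed to be counted in characters with the keyword centred, and stop_after=None is assumed to mean no limit on the number of lines.

```python
import re

def concordance(text, keyword, line_width=60, stop_after=None):
    """Return keyword-in-context lines of at most line_width characters."""
    half = max(0, (line_width - len(keyword)) // 2)  # context on each side
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        if stop_after is not None and len(lines) >= stop_after:
            break
        left = text[max(0, match.start() - half):match.start()].rjust(half)
        right = text[match.end():match.end() + half].ljust(half)
        lines.append(left + match.group(0) + right)
    return lines

sample = "The cat sat. A cat ran. My cat slept."
for line in concordance(sample, "cat", line_width=21, stop_after=2):
    print(line)
```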

    • Link to context

    The keyword in each concordance line is highlighted. Click on the keyword to see the wider context displayed.

    • Sentence concordancer

    Use the sentence concordance option that appears below the concordance lines to conduct another search for the same string. Each instance of the search string will then be presented in its sentential context in the resulting search output (i.e. the complete sentence is displayed).
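
    A sentence concordance can be approximated by splitting the text into sentences and keeping those that contain the search string. The sketch below assumes a deliberately naive sentence splitter (a break after '.', '!' or '?' followed by whitespace), whole-word matching, and a hypothetical function name; a sentence containing the string more than once is returned only once.

```python
import re

def sentence_concordance(text, keyword):
    """Return the complete sentences that contain the keyword as a whole word."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [sentence for sentence in sentences if pattern.search(sentence)]

passage = "The bank opened. It rained all day. She left the bank early!"
for sentence in sentence_concordance(passage, "bank"):
    print(sentence)
```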
