
Python Token & Character Sets

A token is the smallest individual unit of a program.

Introduction

A Python program is called a script. It is a sequence of definitions and commands: the definitions are evaluated and the commands are executed by the Python interpreter. The interactive mode of the interpreter is called the Python shell.


Python Syntax - the set of rules that defines how a Python program must be written by the programmer and how it is interpreted by the system.


Basic Elements of Python Program


Every Python program contains some basic elements like character set, tokens, expressions, statements, input & output.


Character Set – The set of valid characters that a language can recognize. Python supports the Unicode encoding standard and has the following character sets –

Letters – A to Z, a to z

Digits – 0 to 9

Special Symbols – Symbols available on the keyboard, such as + - * / ( ) [ ] { } = # @ and other punctuation marks.

Whitespaces – Blank space, Tabs, Newline, Carriage return

Other Characters – All ASCII and UNICODE characters
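
As a small illustration of the character sets listed above, the following sketch looks up the Unicode code point of a few sample characters and sorts them into letters, digits and whitespace with built-in string methods. The sample characters and variable names are chosen only for this example.

```python
# Each character has a Unicode code point; ord() returns it and chr() reverses it.
samples = ["A", "z", "7", "#", " ", "\t", "é", "अ"]
code_points = {ch: ord(ch) for ch in samples}

# Classify the sample characters using built-in str methods.
letters = [ch for ch in samples if ch.isalpha()]
digits = [ch for ch in samples if ch.isdigit()]
spaces = [ch for ch in samples if ch.isspace()]

print(code_points)
print("letters:", letters)
print("digits:", digits)
print("whitespace:", spaces)
```

Letters are not limited to A to Z: characters such as é and अ are also reported as letters, which reflects Python's Unicode support.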


Tokens – The lexical units (lexical elements) of a program, which are used to build its statements and instructions, are called tokens.

Python has the following tokens –

(i) Keywords (ii) Identifiers (iii) Literals (iv) Operators (v) Punctuators
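
To see how a statement breaks into tokens, the sketch below uses the standard library's tokenize module on a made-up statement (the variable names total and price are only for illustration). Note that tokenize uses coarser categories than the five listed above: it reports keywords and identifiers together as NAME, and operators and punctuators together as OP.

```python
import io
import tokenize

source = "total = price * 2 + 10\n"

# generate_tokens() expects a readline-style callable that returns lines of text.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Pair each token's category name with its text, skipping layout tokens
# such as NEWLINE and ENDMARKER.
pairs = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokens
    if tok.type in (tokenize.NAME, tokenize.NUMBER, tokenize.OP)
]

for kind, text in pairs:
    print(f"{kind:7} {text}")
```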


1. Keywords – 


Keywords are words that convey a special meaning to the Python interpreter. They are reserved for special purposes and must not be used as identifiers.

Python has the following keywords (Python 3.7 and later additionally reserve async and await) –

False None True and as assert break class continue def del

elif  else except  finally for from global if import in  is

lambda  nonlocal not or pass raise return try while with yield


To display the list of keywords available in your version of Python, enter the following two statements at the prompt. The output shown here is from a version earlier than Python 3.7; later versions also list 'async' and 'await'.

>>> import keyword
>>> print(keyword.kwlist)
['False', 'None', 'True', 'and', 'as', 'assert', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'nonlocal', 'not', 'or', 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield']
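
The same module can also test a single word. The sketch below, using a few sample words picked for illustration, checks each one with keyword.iskeyword() and then collects the keywords that begin with an uppercase letter.

```python
import keyword

# iskeyword() reports whether a string is a reserved word; the check is case-sensitive.
words = ["while", "While", "none", "None", "total"]
checks = {word: keyword.iskeyword(word) for word in words}
print(checks)

# Collect the keywords that start with an uppercase letter.
capitalised = [kw for kw in keyword.kwlist if kw[0].isupper()]
print(capitalised)
```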


All these keywords are written in lowercase letters except False, None and True, which start with a capital letter.


2. Identifier – 


The name given to a part of a program, such as a variable, object, class, method, module, list or dictionary, is called an identifier.


Identifier rules of Python –

i) An identifier is an arbitrarily long sequence of letters, digits and underscores. E.g. rateofinterest, listofstudents

ii) The first character must be a letter or the underscore (_); it must not be a digit. E.g. _sum, sum, _1sum

iii) It is case-sensitive. E.g. Sum and sum are different identifiers.

iv) It must not be a keyword. E.g. if, else, break are not allowed.

v) It can’t contain special characters other than the underscore (_). E.g. ^sum and sum# are not allowed.

vi) White space is not allowed. E.g. rate of interest and simple interest are not allowed because of the spaces.


Example of Valid Identifiers –

Unit_per_day, dayofweek, stud_name, father_name, TotalMarks, age, amount, a24, invoice_no


Example of Invalid Identifiers-

3rdGraph - Name can’t start with a digit

Roll#No - Special symbol # is not allowed

First Name -  Spaces are not allowed.

D.O.B. -  The special symbol . (dot) is not allowed.

while - Keyword not allowed.
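
The rules above can be checked programmatically. The following is a minimal sketch, not an official validator: the helper name is_valid_identifier is hypothetical, and it combines the built-in str.isidentifier() method with keyword.iskeyword(), because isidentifier() alone accepts reserved words.

```python
import keyword


def is_valid_identifier(name):
    """Return True if name could be used as a variable name."""
    # str.isidentifier() checks the character rules;
    # keyword.iskeyword() rejects reserved words such as 'while'.
    return name.isidentifier() and not keyword.iskeyword(name)


candidates = [
    "Unit_per_day", "a24", "_1sum",                           # expected to be valid
    "3rdGraph", "Roll#No", "First Name", "D.O.B.", "while",   # expected to be invalid
]
results = {name: is_valid_identifier(name) for name in candidates}

for name, ok in results.items():
    print(f"{name!r:16} {'valid' if ok else 'invalid'}")
```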


3. Literals / Values – 


The fixed values or data items used in a program are called literals. E.g. “Anjeev Singh”, 38, 58690.36, True, False, None, etc.


Python allows five kinds of literals –


String Literals,

Numeric Literals,

Boolean Literals,

Special Literal None,

Literal Collection


a) String Literals – Text written inside quotes is called a string literal. In Python, a string literal can be formed by enclosing text in single quotes, double quotes or triple quotes.

For example – 'a',   "a",  'anjeev kumar singh',  "anjeev singh academy".

""" hello how are
You, I am fine,
Thank you
"""

String types in Python - Python allows two types of strings –

i) Single-line String

ii) Multi-line String


Single Line String – A string written inside single quotes (' ') or double quotes (" ") is called a single-line string.


Multi-line String – A string whose text is spread across multiple lines is called a multi-line string. It can be written in two ways – by ending each line with a backslash (\), which joins the lines without storing a newline, or by typing the text in triple quotes, which stores a newline at each line break.


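For Example

>>> # A backslash (\) at the end of a line is not counted as a character
>>> str1 = "Hello\
... How are you?\
... I am fine."
>>> # Inside triple quotes, each end of line (EOL) is counted as a character
>>> str2 = """Hello
... How are You?
... I am fine."""
>>> len(str1)
27
>>> len(str2)
29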


b) Numeric Literals – Literals written in the form of a number, positive or negative, whole or fractional, are called numeric literals. Python offers three types of numeric literals –


Integer Literals (int) - Integers, or ints, are positive or negative whole numbers with no decimal point. Commas cannot appear in an integer literal.

a. Decimal Integer: An integer written using the digits 0 to 9. E.g. – 1254, +589, -987

b. Octal Integer: Digits 0 to 7 preceded by 0o or 0O (zero followed by the letter o). E.g. 0o27, 0o35

c. Hexadecimal Integer: Digits 0 to 9 and letters A to F preceded by 0x or 0X (zero followed by the letter x). E.g. 0x19A, 0xFA
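
The three notations are only different ways of writing the same kind of value. The short sketch below writes 27 in each notation, prints the octal and hexadecimal examples from the list above in decimal, and converts back with the built-in oct() and hex() functions. The variable names are illustrative.

```python
# The same value, 27, written in three notations.
decimal_value = 27
octal_value = 0o33      # 3*8 + 3 = 27
hex_value = 0x1B        # 1*16 + 11 = 27

print(decimal_value == octal_value == hex_value)   # True

# The octal and hexadecimal examples, shown as decimal numbers.
print(0o27, 0o35, 0x19A, 0xFA)                     # 23 29 410 250

# oct() and hex() convert an int back to prefixed text.
print(oct(27), hex(27))                            # 0o33 0x1b
```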


Floating Point Literals (float) – Real numbers, written with a decimal point separating the integer part and the fractional part.


a. Fractional Form – A real number must have at least one digit either before or after the decimal point.

Example – 89.6589, 0.56898, 1.265, 25.0


b.    Exponent Form – A real number written in two parts, a mantissa and an exponent. The mantissa is followed by the letter E or e and then the exponent. The mantissa may be either an integer or a real number, while the exponent must be a positive or negative integer.

Example – 0.125E25, -14.26e21
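
A brief sketch of the two forms, with values chosen only for illustration: the exponent form means mantissa times 10 raised to the exponent, and the result is always a float even when the mantissa is written as an integer.

```python
fractional = 1250.0
exponent = 1.25e3        # 1.25 * 10**3
small = 25E-2            # integer mantissa with a negative exponent

print(fractional == exponent)      # True
print(small)                       # 0.25
print(type(small).__name__)        # float
```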


Complex Literals (complex) – are of the form a + bj, where a is the real part of the number and b is the imaginary part. j represents √-1, the imaginary unit.
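
As a quick check of this form, the sketch below builds a sample complex number (3 + 4j, chosen for illustration), reads back its parts through the real and imag attributes, and confirms that j squared is -1.

```python
z = 3 + 4j

real_part = z.real       # 3.0
imag_part = z.imag       # 4.0
magnitude = abs(z)       # sqrt(3**2 + 4**2) = 5.0
j_squared = 1j * 1j      # (-1+0j)

print(real_part, imag_part, magnitude)
print(j_squared == -1)   # True
```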


c) Boolean Literals – Python has two Boolean literals: True (Boolean true) and False (Boolean false).


d) Special Literal None – None is the special literal in Python. It is used to indicate nothing, no value, or the absence of a value. In linked data structures, it is commonly used to mark the end of a list.

  • In Python, None means “There is no useful information” or “There is nothing here”.

  • Nothing is displayed when you type the name of a variable that contains None at the prompt.

Note

(1) True, False and None are keywords, but they start with a capital letter; all other keywords are written in lowercase letters.

(2) Boolean literals True, False and Special literal None are built-in constants/literals of Python.
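
One common place where None appears is as the result of a function that has nothing to return. The sketch below assumes a hypothetical helper, find_roll, and made-up sample records; it shows that None is tested with the is operator.

```python
def find_roll(records, name):
    """Return the roll number stored for name, or None if name is absent."""
    for record_name, roll in records:
        if record_name == name:
            return roll
    # Reaching the end of a function without a return statement returns None.


records = [("Mohit", 102), ("Anjeev", 38)]
found = find_roll(records, "Mohit")
missing = find_roll(records, "Ravi")

print(found)                     # 102
print(missing is None)           # True
print(type(None).__name__)       # NoneType
```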


e) Literals Collections


List  - is a sequence of comma-separated values of any data type between square brackets. It can be changed (it is mutable). E.g.

p = [1, 2, 3, 4]

m = ['a', 'e', 'i', 'o', 'u']

q = ["Anjeev", "kumar", "singh"]

r = ["Mohit", 102, 85.2, True]


Tuple - is a sequence of comma-separated values of any data type between parentheses. It cannot be changed (it is immutable). E.g.

p = (1, 2, 3, 4)

m = ('a', 'e', 'i', 'o', 'u')

q = ("Anjeev", "kumar", "singh")

r = ("Mohit", 102, 85.2, 'M', True)


Dictionary – is a collection of comma-separated key : value pairs within curly braces. Since Python 3.7, a dictionary keeps its pairs in insertion order. E.g.

rec = {'name': "Mohit", 'roll': 102, 'marks': 85.2, 'sex': 'M', 'ip': True}
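
The practical difference between these collections shows up when you try to change them. In this sketch (variable names are illustrative), a list item is replaced, the same attempt on a tuple raises TypeError, and a dictionary value is looked up and updated through its key.

```python
marks_list = [1, 2, 3, 4]
marks_list[0] = 10               # lists are mutable, so this succeeds

marks_tuple = (1, 2, 3, 4)
try:
    marks_tuple[0] = 10          # tuples are immutable, so this raises TypeError
    tuple_changed = True
except TypeError:
    tuple_changed = False

rec = {"name": "Mohit", "roll": 102, "marks": 85.2}
rec["marks"] = 90.0              # replace the value stored under an existing key
roll = rec["roll"]               # look up a value by its key

print(marks_list, marks_tuple, tuple_changed)
print(rec, roll)
```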



4. Operators– 


Operators are tokens that perform some operation, calculation or computation in an expression.

  • The variables and objects to which the computation or operation is applied are called operands.

  • An operator requires one or more operands to work upon.
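
A small sketch of operators and their operands, using made-up variables price and quantity: a binary operator takes two operands, a unary operator takes one, and relational and logical operators can be combined in a single expression.

```python
price = 250
quantity = 3

total = price * quantity                    # binary operator * with two operands
negated = -total                            # unary operator - with one operand
is_large = total > 500 and quantity < 10    # relational and logical operators

print(total, negated, is_large)             # 750 -750 True
```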


5. Punctuators / Delimiters  – 


Punctuators are symbols used in a programming language to organize the structure of statements and expressions.

The most common punctuators of the Python programming language are:

( )  [ ] { }  '  "  ,  : . ; @  =  += -= *= /= //= %= @= &= |= ^= >>= <<= **=
